Hex to Base64 conversion in Python
Edit 26 Aug 2020: As suggested by Ali in the comments, using codecs.encode(b, "base64") inserts extra line breaks, following MIME syntax. Only use this method if you actually want that line-break formatting. For plain Base64 encoding/decoding, use base64.b64encode and base64.b64decode. See the answer from Ali for details.
In Python 3, arbitrary encodings including Hex and Base64 have been moved to the codecs module. To get a Base64 str from a hex str:
import codecs
# named hex_string to avoid shadowing the built-in hex() function
hex_string = "10000000000002ae"
b64 = codecs.encode(codecs.decode(hex_string, 'hex'), 'base64').decode()
The tool you link to simply interprets the hex as bytes, then encodes those bytes to Base64.
Either use the binascii.unhexlify() function to convert from a hex string to bytes, or use the bytes.fromhex() class method. Then use the binascii.b2a_base64() function to convert that to Base64:
from binascii import unhexlify, b2a_base64
result = b2a_base64(unhexlify(hex_string))
or
from binascii import b2a_base64
result = b2a_base64(bytes.fromhex(hex_string))
In Python 2, you can also use the str.decode() and str.encode() methods to achieve the same:
result = hex_string.decode('hex').encode('base64')
In Python 3, you'd have to use the codecs.encode() function for this.
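A minimal sketch of that Python 3 equivalent, assuming hex_string holds the example value used in this answer. Like b2a_base64(), the base64 codec appends a trailing newline:

```python
import codecs

hex_string = "10000000000002ae"  # example value, as in the demos in this answer

# Python 3 counterpart of hex_string.decode('hex').encode('base64')
raw = codecs.decode(hex_string, "hex")   # hex text -> raw bytes
result = codecs.encode(raw, "base64")    # raw bytes -> Base64 bytes, newline-terminated

print(result)
```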
Demo in Python 3:
>>> bytes.fromhex('10000000000002ae')
b'\x10\x00\x00\x00\x00\x00\x02\xae'
>>> from binascii import unhexlify, b2a_base64
>>> unhexlify('10000000000002ae')
b'\x10\x00\x00\x00\x00\x00\x02\xae'
>>> b2a_base64(bytes.fromhex('10000000000002ae'))
b'EAAAAAAAAq4=\n'
>>> b2a_base64(unhexlify('10000000000002ae'))
b'EAAAAAAAAq4=\n'
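The trailing \n in the output above is added by b2a_base64() by default. If it is unwanted, one option (a sketch, assuming Python 3.6 or later, where the keyword-only newline argument is available) is to switch it off rather than strip it afterwards:

```python
from binascii import b2a_base64

hex_string = "10000000000002ae"  # example value, as in the demo above

with_newline = b2a_base64(bytes.fromhex(hex_string))
without_newline = b2a_base64(bytes.fromhex(hex_string), newline=False)

print(with_newline)
print(without_newline)
```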
Demo on Python 2.7:
>>> '10000000000002ae'.decode('hex')
'\x10\x00\x00\x00\x00\x00\x02\xae'
>>> '10000000000002ae'.decode('hex').encode('base64')
'EAAAAAAAAq4=\n'
>>> from binascii import unhexlify, b2a_base64
>>> unhexlify('10000000000002ae')
'\x10\x00\x00\x00\x00\x00\x02\xae'
>>> b2a_base64(unhexlify('10000000000002ae'))
'EAAAAAAAAq4=\n'
Python 2 has native support for both hex and Base64 encoding:
encoded = HEX_STRING.decode("hex").encode("base64")
(if you are using Python 3, see Eana Hufwe or Ali's answers instead)
from base64 import b64encode, b64decode
# hex -> base64
s = 'cafebabe'
b64 = b64encode(bytes.fromhex(s)).decode()
print('cafebabe in base64:', b64)
# base64 -> hex
s2 = b64decode(b64.encode()).hex()
print('yv66vg== in hex is:', s2)
assert s == s2
This prints:
cafebabe in base64: yv66vg==
yv66vg== in hex is: cafebabe
The relevant functions in the documentation, hex to base64:
- b64encode
- bytes.fromhex
- bytes.decode
Base64 to hex:
- b64decode
- str.encode
- bytes.hex
I don't understand why many of the other answers are making it so complicated. For example the most upvoted answer as of Aug 26, 2020:
- There is no need for the codecs module here.
- The codecs module uses base64.encodebytes(s) under the hood, so it converts to multiline MIME Base64: you get a newline after every 76 characters of output. Unless you are sending it in e-mail, it is most likely not what you want.
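To make the line-wrapping difference concrete, here is a small sketch comparing the two approaches. The 100-byte input is an arbitrary choice, just long enough to need more than one output line:

```python
import base64
import codecs

data = bytes(range(100))  # arbitrary sample input, long enough to wrap

mime = codecs.encode(data, "base64")   # MIME-style: newline-terminated lines
plain = base64.b64encode(data)         # plain Base64: a single unbroken line

print(mime.count(b"\n"))   # number of line breaks in the codecs output
print(plain.count(b"\n"))  # number of line breaks in the b64encode output

# Apart from the line breaks, both carry the same Base64 text
assert mime.replace(b"\n", b"") == plain
```

Both outputs decode back to the same bytes with base64.b64decode(), so the difference only matters when the encoded text itself has to match a specific format.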
As for specifying 'utf-8' when encoding a string or decoding bytes: it adds unnecessary noise. In Python 3, str.encode() and bytes.decode() both default to UTF-8. It is not a coincidence that the writers of the standard library chose that default, so that you don't have to needlessly specify the UTF-8 encoding over and over again.
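A quick check of that default, using an arbitrary non-ASCII sample string so that the choice of encoding is visible in the resulting bytes:

```python
text = "café"  # arbitrary sample; the é takes two bytes in UTF-8

implicit = text.encode()          # no encoding argument given
explicit = text.encode("utf-8")   # encoding spelled out

print(implicit)

# The two calls are interchangeable, and so are the matching decode calls
assert implicit == explicit
assert implicit.decode() == implicit.decode("utf-8") == text
```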