Convert string to list of bits and vice versa
Not sure why, but here are two ugly one-liners using only builtins:
s = "Hi"
# list() is needed on Python 3, where map() returns an iterator that cannot be sliced
l = list(map(int, ''.join([bin(ord(i))[2:].rjust(8,'0') for i in s])))
s = "".join(chr(int("".join(map(str,l[i:i+8])),2)) for i in range(0,len(l),8))
yields:
>>> l
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
>>> s
'Hi'
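The two one-liners are easier to follow when split into named steps. The sketch below is one possible spelling, not the answer's own code: the helper names `to_bits` and `from_bits` are made up for illustration, and it assumes every character has a code point below 256 so that it fits in 8 bits.

```python
def to_bits(s):
    """Return the bits of each character in s, 8 per character, MSB first."""
    bits = []
    for ch in s:
        code = ord(ch)
        if code > 255:
            # Assumption of this sketch: one character maps to exactly one byte.
            raise ValueError("character does not fit in 8 bits: %r" % ch)
        # format(code, '08b') gives a zero-padded, 8-digit binary string.
        bits.extend(int(d) for d in format(code, '08b'))
    return bits


def from_bits(bits):
    """Inverse of to_bits: read the list 8 bits at a time."""
    chars = []
    for i in range(0, len(bits), 8):
        chunk = bits[i:i + 8]
        chars.append(chr(int(''.join(str(b) for b in chunk), 2)))
    return ''.join(chars)


print(to_bits("Hi"))
print(from_bits(to_bits("Hi")))
```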
In real-world code, use the struct or the bitarray module.
There are many ways to do this with library functions, but I am partial to the third-party bitarray module.
>>> import bitarray
>>> ba = bitarray.bitarray()
Conversion from strings requires a bit of ceremony. Once upon a time, you could just use fromstring, but that method is now deprecated, since it has to implicitly encode the string into bytes. To avoid the inevitable encoding errors, it's better to pass a bytes object to frombytes. When starting from a string, that means you have to specify an encoding explicitly, which is good practice anyway.
>>> ba.frombytes('Hi'.encode('utf-8'))
>>> ba
bitarray('0100100001101001')
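To see why the explicit encoding matters, here is a small standard-library sketch (it does not use bitarray, and the names `text_to_bits` and `bits_to_text` are hypothetical helpers, not part of any library). The same character can produce a different number of bits depending on the encoding chosen.

```python
SHIFTS = (7, 6, 5, 4, 3, 2, 1, 0)


def text_to_bits(text, encoding="utf-8"):
    """Encode text to bytes, then list each byte's bits, MSB first."""
    data = text.encode(encoding)
    return [(byte >> shift) & 1 for byte in data for shift in SHIFTS]


def bits_to_text(bits, encoding="utf-8"):
    """Pack bits back into bytes (8 at a time) and decode them."""
    if len(bits) % 8:
        raise ValueError("bit list length must be a multiple of 8")
    data = bytes(
        sum(bit << shift for bit, shift in zip(bits[i:i + 8], SHIFTS))
        for i in range(0, len(bits), 8)
    )
    return data.decode(encoding)


print(len(text_to_bits("é", "utf-8")))    # two bytes in UTF-8
print(len(text_to_bits("é", "latin-1")))  # one byte in Latin-1
print(bits_to_text(text_to_bits("Hi")))
```

Decoding has to use the same encoding that produced the bits, otherwise the round trip may fail or return different text.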
Conversion to a list is easy. (Also, bitarray objects have a lot of list-like functions already.)
>>> l = ba.tolist()
>>> l
[False, True, False, False, True, False, False, False,
False, True, True, False, True, False, False, True]
bitarrays can be created from any iterable:
>>> bitarray.bitarray(l)
bitarray('0100100001101001')
Conversion back to bytes or strings is relatively easy too:
>>> bitarray.bitarray(l).tobytes().decode('utf-8')
'Hi'
And for the sake of sheer entertainment:
>>> def s_to_bitlist(s):
...     ords = (ord(c) for c in s)
...     shifts = (7, 6, 5, 4, 3, 2, 1, 0)
...     return [(o >> shift) & 1 for o in ords for shift in shifts]
...
>>> def bitlist_to_chars(bl):
...     bi = iter(bl)
...     bytes = zip(*(bi,) * 8)
...     shifts = (7, 6, 5, 4, 3, 2, 1, 0)
...     for byte in bytes:
...         yield chr(sum(bit << s for bit, s in zip(byte, shifts)))
...
>>> def bitlist_to_s(bl):
...     return ''.join(bitlist_to_chars(bl))
...
>>> s_to_bitlist('Hi')
[0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
>>> bitlist_to_s(s_to_bitlist('Hi'))
'Hi'
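The least obvious step above is `zip(*(bi,) * 8)`. It passes the same iterator to zip eight times, so each output tuple consumes eight consecutive items. A minimal sketch of just that idiom follows; the name `group_bits` is made up for illustration.

```python
def group_bits(bits, size=8):
    """Split bits into consecutive tuples of `size` items each."""
    it = iter(bits)
    # (it,) * size is a tuple holding the SAME iterator `size` times,
    # so zip pulls `size` consecutive items for every tuple it builds.
    return list(zip(*(it,) * size))


bits = [0, 1, 0, 0, 1, 0, 0, 0, 0, 1, 1, 0, 1, 0, 0, 1]
for group in group_bits(bits):
    print(group)
```

Note that zip stops at the shortest input, so a trailing group with fewer than `size` bits is silently dropped.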
You could use the built-in bytearray:
>>> for i in bytearray('Hi', 'ascii'):
...     print(i)
...
72
105
>>> bytearray([72, 105]).decode('ascii')
'Hi'
And bin() to convert to binary.
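Putting the two together might look like the sketch below. It assumes ASCII input, as in the example above; the variable names are only illustrative.

```python
data = bytearray('Hi', 'ascii')

# bin() returns strings such as '0b1001000'; drop the '0b' prefix
# and left-pad with zeros to a full 8 digits.
binary = [bin(b)[2:].zfill(8) for b in data]
print(binary)

# Going back: parse each 8-digit string as base 2 and rebuild the bytes.
restored = bytearray(int(digits, 2) for digits in binary).decode('ascii')
print(restored)
```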
There are probably faster ways to do this, but using no extra modules:
def tobits(s):
    result = []
    for c in s:
        bits = bin(ord(c))[2:]
        # left-pad with zeros to a full 8 bits
        bits = '00000000'[len(bits):] + bits
        result.extend([int(b) for b in bits])
    return result
def frombits(bits):
    chars = []
    # integer division: range() needs an int on Python 3
    for b in range(len(bits) // 8):
        byte = bits[b*8:(b+1)*8]
        chars.append(chr(int(''.join([str(bit) for bit in byte]), 2)))
    return ''.join(chars)
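One alternative that avoids the per-character loop, still without extra modules, is to go through a single large integer with int.from_bytes and int.to_bytes (Python 3). This is only a sketch and has not been benchmarked against the version above; the names `tobits_int` and `frombits_int` are made up, and latin-1 is assumed so that each character maps to one byte, as chr and ord do above for code points below 256.

```python
def tobits_int(s, encoding="latin-1"):
    """Convert s to a bit list via one big integer."""
    data = s.encode(encoding)
    if not data:
        return []
    n = int.from_bytes(data, "big")
    # Zero-pad to exactly 8 bits per byte so leading zeros are kept.
    return [int(d) for d in format(n, "0{}b".format(len(data) * 8))]


def frombits_int(bits, encoding="latin-1"):
    """Inverse of tobits_int; len(bits) must be a multiple of 8."""
    if not bits:
        return ""
    if len(bits) % 8:
        raise ValueError("bit list length must be a multiple of 8")
    n = int("".join(str(b) for b in bits), 2)
    return n.to_bytes(len(bits) // 8, "big").decode(encoding)


print(tobits_int("Hi"))
print(frombits_int(tobits_int("Hi")))
```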