How to read integers from a file that are 24bit and little endian using Python?
Python 3 Method
In Python 3 I prefer using int.from_bytes() to convert a 3-byte representation into an integer. No padding is needed, because Python integers are not fixed-width.
value = int.from_bytes(input_data[0:3], 'little', signed=True)  # use 'big' for big-endian data
or just
value = int.from_bytes(input_data)
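This works if your array is exactly 3 bytes long and the data matches the defaults (big-endian, unsigned). Note that the byteorder argument may be omitted only in Python 3.11 and later; for little-endian data, pass 'little' explicitly.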
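To decode more than one value, the same call can be applied to each 3-byte slice in turn. The following is a minimal sketch, not part of the original answer: the helper name read_int24_le and the sample bytes are made up for illustration, and it assumes the data is a tightly packed run of 24-bit little-endian integers.

```python
def read_int24_le(data, signed=True):
    """Decode a bytes object holding packed 24-bit little-endian integers."""
    if len(data) % 3 != 0:
        raise ValueError("length must be a multiple of 3 bytes")
    return [
        int.from_bytes(data[i:i + 3], byteorder="little", signed=signed)
        for i in range(0, len(data), 3)
    ]


# Hypothetical sample: the values 1, -1 and 0x123456 packed back to back.
sample = bytes([0x01, 0x00, 0x00, 0xFF, 0xFF, 0xFF, 0x56, 0x34, 0x12])
values = read_int24_le(sample)  # [1, -1, 1193046]
```

For a real file, the data argument would typically come from reading the file opened in binary mode ('rb').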
If you don't mind using an external library then my bitstring module could be helpful here.
from bitstring import ConstBitStream
s = ConstBitStream(filename='some_file')
a = s.read('uintle:24')
This reads in the first 24 bits and interprets them as an unsigned little-endian integer. After the read, s.pos is set to 24 (the bit position in the stream), so you can then read more. For example, if you wanted to get a list of the next 10 signed integers you could use
l = s.readlist('10*intle:24')
or if you prefer you could just use slices and properties and not bother with reads:
a = s[0:24].uintle
Another alternative, if you already have the 3 bytes of data from your file, is just to create and interpret:
a = ConstBitStream(bytes=b'abc').uintle
Python's struct module lets you interpret bytes as different kinds of data structure, with control over endianness.
If you read a single three-byte number from the file, you can convert it thus:
struct.unpack('<I', bytes + b'\0')[0]
The module has no format code for 24-bit words, hence the '\0' padding to 4 bytes. struct.unpack always returns a tuple, so [0] picks out the value.
EDIT: Signed numbers are trickier. You have to sign-extend the value: if the high bit of the third byte (the most significant one) is set, pad with \xff instead of \0, so that the sign bit ends up in the highest place of the 4 bytes. In Python 2:
struct.unpack('<i', bytes + ('\0' if bytes[2] < '\x80' else '\xff'))
Or, for Python 3 (bytes is the name of a built-in type, so the variable is called chunk here, and indexing a bytes object gives an int):
struct.unpack('<i', chunk + (b'\0' if chunk[2] < 128 else b'\xff'))
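The padding approach above can be wrapped in a small Python 3 helper that covers both the signed and the unsigned case. This is only a sketch: the function name unpack_int24_le and the test bytes are illustrative assumptions, not anything from the struct module itself.

```python
import struct


def unpack_int24_le(chunk, signed=True):
    """Decode one 3-byte little-endian value by padding it to 4 bytes."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    if signed:
        # Sign-extend: the pad byte repeats the top bit of the
        # most significant (third) byte.
        pad = b"\xff" if chunk[2] & 0x80 else b"\x00"
        return struct.unpack("<i", chunk + pad)[0]
    # Unsigned values are always padded with a zero byte.
    return struct.unpack("<I", chunk + b"\x00")[0]


negative_two = unpack_int24_le(b"\xfe\xff\xff")            # -2
largest = unpack_int24_le(b"\xff\xff\x7f")                 # 8388607
unsigned = unpack_int24_le(b"\xfe\xff\xff", signed=False)  # 16777214
```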
Are your 24-bit integers signed or unsigned? Big-endian or little-endian?
struct.unpack('<I', bytes + b'\x00')[0] # unsigned little-endian
struct.unpack('>I', b'\x00' + bytes)[0] # unsigned big-endian
Signed is a little more complicated ... get the unsigned value as above, then do this:
signed = unsigned if not (unsigned & 0x800000) else unsigned - 0x1000000
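To see why the subtraction works: bit 23 (0x800000) is the sign bit of a 24-bit two's-complement number, and any value with that bit set represents the unsigned value minus 2**24 (0x1000000). A minimal sketch that checks the boundary cases, with to_signed24 as a made-up helper name:

```python
def to_signed24(unsigned):
    """Reinterpret an unsigned 24-bit value as two's-complement signed."""
    if not 0 <= unsigned <= 0xFFFFFF:
        raise ValueError("value does not fit in 24 bits")
    # Bit 23 is the sign bit; if it is set, subtract 2**24.
    return unsigned - 0x1000000 if unsigned & 0x800000 else unsigned


# Smallest and largest values on either side of the sign boundary.
boundaries = [to_signed24(v) for v in (0x000000, 0x7FFFFF, 0x800000, 0xFFFFFF)]
# boundaries == [0, 8388607, -8388608, -1]
```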