Reading a binary file into a struct
Actually, it looks like you're trying to read a list (or array) of structures from the file. The idiomatic way to do this in Python is to use the struct module and call struct.unpack() in a loop (either a fixed number of times if you know the number of them in advance, or until end-of-file is reached) and store the results in a list. Here's an example of the latter:
import struct

struct_fmt = '=5if255s'  # int[5], float, byte[255]
struct_len = struct.calcsize(struct_fmt)
struct_unpack = struct.Struct(struct_fmt).unpack_from

results = []
with open(filename, "rb") as f:
    while True:
        data = f.read(struct_len)
        if not data:
            break
        s = struct_unpack(data)
        results.append(s)
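For the other case mentioned above, where the number of records is known in advance, a minimal sketch might look like the following. The names read_records, record, and sample are hypothetical, and an in-memory io.BytesIO stands in for the real file so that the example runs on its own:

```python
import io
import struct

# Same layout as above: int[5], float, byte[255] (279 bytes with '=').
record = struct.Struct('=5if255s')

def read_records(f, count):
    """Read exactly `count` records from the binary file object `f`."""
    results = []
    for _ in range(count):
        data = f.read(record.size)
        if len(data) < record.size:
            # Fewer bytes than one full record: the file is shorter than expected.
            raise EOFError('file ended before all records were read')
        results.append(record.unpack(data))
    return results

# Hypothetical sample data: two packed records held in memory.
sample = (record.pack(1, 2, 3, 4, 5, 1.5, b'first') +
          record.pack(6, 7, 8, 9, 10, 2.5, b'second'))
records = read_records(io.BytesIO(sample), 2)
```

Checking the length of each chunk, rather than only whether it is empty, also catches a file that ends in the middle of a record.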
The read-until-end-of-file loop can also be written slightly more concisely using a list comprehension along with a short generator helper function (read_chunks() below):
def read_chunks(f, length):
    while True:
        data = f.read(length)
        if not data:
            break
        yield data

with open(filename, "rb") as f:
    results = [struct_unpack(chunk) for chunk in read_chunks(f, struct_len)]
Update
You don't, in fact, need to explicitly define a helper function as shown above, because you can use Python's built-in iter() function to dynamically create the needed iterator object in the list comprehension itself, like so:
from functools import partial

with open(filename, "rb") as f:
    results = [struct_unpack(chunk) for chunk in iter(partial(f.read, struct_len), b'')]
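To see how the two-argument form of iter() behaves, here is a small self-contained sketch. The two-int record layout and the name pair are assumptions made for illustration, and io.BytesIO stands in for a file opened in "rb" mode:

```python
import io
import struct
from functools import partial

pair = struct.Struct('=2i')  # hypothetical record layout: two 32-bit ints

# Stand-in for a binary file holding three records.
f = io.BytesIO(pair.pack(1, 2) + pair.pack(3, 4) + pair.pack(5, 6))

# iter(callable, sentinel) calls f.read(pair.size) repeatedly and stops
# as soon as a call returns the sentinel b'' (which read() returns at EOF).
results = [pair.unpack(chunk) for chunk in iter(partial(f.read, pair.size), b'')]
```

Here results ends up holding one tuple per record. The sentinel has to be b'' rather than '' because a file opened in binary mode returns bytes.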
Use the struct module; you need to define the types in a string format documented with that library:
struct.unpack('=HHf255s', bytes)
The above example expects native byte order (with standard sizes and no alignment padding, because of the = prefix), two unsigned shorts, a float, and a string of 255 bytes.
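As a quick check of that format string, the sketch below packs one sample record and unpacks it again. The field values and the names fmt, raw, and name are made up for illustration:

```python
import struct

fmt = '=HHf255s'
size = struct.calcsize(fmt)  # 2 + 2 + 4 + 255 bytes; '=' adds no padding

# Build one sample record so the example needs no input file.
raw = struct.pack(fmt, 1, 2, 0.5, b'hello')

a, b, value, text = struct.unpack(fmt, raw)
name = text.rstrip(b'\x00')  # 's' fields are padded with null bytes when packed
```

The 255s field always comes back as exactly 255 bytes, so trailing null bytes usually need to be stripped before the value is used as text.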
To loop over an already fully read bytes string, I'd use itertools; there is a handy grouper recipe that I've adapted here:
# Python 2 only: izip_longest and imap do not exist in Python 3's itertools.
from itertools import izip_longest, imap
from struct import unpack, calcsize

fmt_s = '=5i'
fmt_spec = '=256i'
size_s = calcsize(fmt_s)
size = size_s + calcsize(fmt_spec)

def chunked(iterable, n, fillvalue=''):
    # Group the input into n-character strings (itertools grouper recipe).
    args = [iter(iterable)] * n
    return imap(''.join, izip_longest(*args, fillvalue=fillvalue))

data = [unpack(fmt_s, section[:size_s]) + (unpack(fmt_spec, section[size_s:]),)
        for section in chunked(bytes, size)]
This produces tuples rather than lists, but it's easy enough to adjust if you have to:
data = [list(unpack(fmt_s, section[:size_s])) + [list(unpack(fmt_spec, section[size_s:]))]
        for section in chunked(bytes, size)]
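The itertools-based code above relies on izip_longest and imap, which are Python 2 names. On Python 3, iterating over a bytes object yields integers, so the ''.join trick no longer applies; a rough equivalent, sketched here with plain slicing instead of the grouper recipe, could look like this. The zero-filled raw buffer is a hypothetical stand-in for the file contents:

```python
from struct import unpack, calcsize

fmt_s = '=5i'
fmt_spec = '=256i'
size_s = calcsize(fmt_s)
size = size_s + calcsize(fmt_spec)

def chunked(buf, n):
    """Yield consecutive n-byte slices of buf, skipping a short trailing slice."""
    for start in range(0, len(buf) - n + 1, n):
        yield buf[start:start + n]

# Hypothetical input: two all-zero records standing in for the data already read.
raw = bytes(size * 2)

# Same result shape as above: five ints followed by one nested tuple of 256 ints.
data = [unpack(fmt_s, section[:size_s]) + (unpack(fmt_spec, section[size_s:]),)
        for section in chunked(raw, size)]
```

Unlike the fillvalue approach, this version silently drops an incomplete final record instead of passing a short chunk to unpack; whether that is the right behavior depends on the file format.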