Fast way to read from StringIO until some byte is encountered
I'm very disappointed that this question got only one answer on Stack Overflow, because it is an interesting and relevant question. Anyway, since only ovgolovin gave a solution and I thought it might be slow, I came up with a faster one:
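def foo(stringio):
    datalist = []
    while True:
        chunk = stringio.read(256)
        i = chunk.find('Z')
        if i == -1:
            datalist.append(chunk)
        else:
            # keep everything up to and including the 'Z', then stop
            datalist.append(chunk[:i+1])
            break
        if len(chunk) < 256:
            # short read: end of data reached without finding 'Z'
            break
    return ''.join(datalist)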
This reads the stream in chunks, since the end character may not be in the first chunk. It is very fast because no Python-level function is called for each character; instead it relies on built-in methods written in C, such as str.find and str.join.
This runs about 60x faster than ovgolovin's solution. I checked it with timeit.
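One caveat with the chunked approach: when the function returns, the stream position is wherever the last 256-character read ended, which can be past the 'Z'. If the rest of the stream is needed afterwards, a variant along these lines might help. This is only a sketch: read_until, delimiter and chunk_size are names made up for illustration, it assumes a single-character delimiter, and it is written against Python 3's io.StringIO rather than the Python 2 strings used above.

```python
import io


def read_until(stream, delimiter="Z", chunk_size=256):
    """Read up to and including `delimiter`, leaving the stream just past it.

    Hypothetical helper; assumes `delimiter` is a single character.
    """
    parts = []
    while True:
        start = stream.tell()            # remember where this chunk begins
        chunk = stream.read(chunk_size)
        i = chunk.find(delimiter)
        if i != -1:
            # Rewind to the chunk start and re-read only the part we keep,
            # so the stream position ends right after the delimiter.
            stream.seek(start)
            parts.append(stream.read(i + 1))
            break
        parts.append(chunk)
        if len(chunk) < chunk_size:      # short read: no delimiter in the data
            break
    return "".join(parts)


stream = io.StringIO("ABCZ123")
first = read_until(stream)   # text up to and including the delimiter
rest = stream.read()         # whatever follows the delimiter
```

The re-read costs at most one extra chunk, so the bulk of the work still happens inside read, find and join.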
i = iter(lambda: stringio.read(1),'Z')
buf = ''.join(i) + 'Z'
Here iter is used in this mode: iter(callable, sentinel) -> iterator.
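To make the two-argument form a bit more concrete, here is a small sketch of what it does with a stream. It uses Python 3's io.StringIO as an assumption (the snippets in this answer are Python 2), and the variable names are only illustrative.

```python
import io

stream = io.StringIO("ABCZ123")

# iter(callable, sentinel) calls the callable again and again and stops
# as soon as a call returns a value equal to the sentinel.
chars = list(iter(lambda: stream.read(1), "Z"))

# The sentinel is consumed from the stream but never yielded,
# so reading on continues right after the 'Z'.
remainder = stream.read()
```

Note that if the stream contains no 'Z' at all, read(1) keeps returning an empty string at the end of the data, which never equals the sentinel, so the iterator would not terminate on its own.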
''.join(...) is quite efficient. The final step of appending 'Z', as in ''.join(i) + 'Z', is less so, because the concatenation builds a second string. This can be avoided by adding 'Z' to the iterator itself:
import StringIO  # Python 2 module
from itertools import chain, repeat

stringio = StringIO.StringIO('ABCZ123')
i = iter(lambda: stringio.read(1), 'Z')
i = chain(i, repeat('Z', 1))
buf = ''.join(i)
Another way to do it is to use a generator:
def take_until_included(stringio):
    while True:
        s = stringio.read(1)
        yield s
        # stop after 'Z', or at end of data (read returns '')
        if s == 'Z' or not s:
            return

i = take_until_included(stringio)
buf = ''.join(i)
I ran some efficiency tests. The performance of the described techniques is pretty much the same:
http://ideone.com/dQGe5
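For anyone who wants to repeat such a comparison locally, a timeit harness could look roughly like the sketch below. It is not the code behind the link: the DATA input, the function names and the repetition count are assumptions chosen for illustration, it targets Python 3's io.StringIO, and it also includes a chunked reader for comparison. Timings will vary by machine and Python version, so none are quoted here.

```python
import io
import timeit
from itertools import chain, repeat

# Hypothetical input: a long run of characters, the delimiter, then a tail.
DATA = "A" * 10000 + "Z" + "B" * 100


def with_concat():
    stream = io.StringIO(DATA)
    return "".join(iter(lambda: stream.read(1), "Z")) + "Z"


def with_chain():
    stream = io.StringIO(DATA)
    it = iter(lambda: stream.read(1), "Z")
    return "".join(chain(it, repeat("Z", 1)))


def with_generator():
    stream = io.StringIO(DATA)

    def take_until_included():
        while True:
            s = stream.read(1)
            yield s
            if s == "Z" or not s:   # stop at the delimiter or at end of data
                return

    return "".join(take_until_included())


def with_chunks(chunk_size=256):
    stream = io.StringIO(DATA)
    parts = []
    while True:
        chunk = stream.read(chunk_size)
        i = chunk.find("Z")
        if i != -1:
            parts.append(chunk[:i + 1])
            break
        parts.append(chunk)
        if len(chunk) < chunk_size:
            break
    return "".join(parts)


CANDIDATES = [with_concat, with_chain, with_generator, with_chunks]

# Time each candidate over a small number of runs and print the totals.
for func in CANDIDATES:
    seconds = timeit.timeit(func, number=20)
    print("%-15s %.4f s for 20 runs" % (func.__name__, seconds))
```

Each function builds its own stream so that every run starts from the beginning of the data.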