How to do virtual file processing?
You have StringIO and BytesIO in the io module.
StringIO behaves like a file opened in text mode, reading and writing unicode strings (equivalent to opening a file with io.open(filename, mode, encoding='...')), and BytesIO behaves like a file opened in binary mode (mode='[rw]b'), reading and writing bytes.
Python 2:
In [4]: f = io.BytesIO('test')
In [5]: type(f.read())
Out[5]: str
In [6]: f = io.StringIO(u'test')
In [7]: type(f.read())
Out[7]: unicode
Python 3:
In [2]: f = io.BytesIO(b'test')
In [3]: type(f.read())
Out[3]: builtins.bytes
In [4]: f = io.StringIO('test')
In [5]: type(f.read())
Out[5]: builtins.str
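To show what "virtual file processing" can look like in practice, here is a minimal sketch that assumes Python 3. The CSV text and the variable names are made up for illustration; the point is only that code expecting a file object can be handed an in-memory one instead.

```python
import csv
import io

# Text mode: csv.reader accepts any iterable of lines, so a StringIO
# can stand in for a file opened in text mode.
text_file = io.StringIO("name,qty\napple,3\npear,5\n")
rows = list(csv.reader(text_file))

# Binary mode: BytesIO holds bytes and supports the usual file methods.
binary_file = io.BytesIO()
binary_file.write(b"\x00\x01\x02")
binary_file.seek(0)  # rewind before reading back what was written
payload = binary_file.read()
```

Anything that only calls read(), write(), seek() or iterates over lines should work the same way with these objects as with a real file.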
You can use StringIO as a virtual file. This example is from the official documentation:
from io import StringIO
output = StringIO()
output.write('First line.\n')
print('Second line.', file=output)
# Retrieve file contents -- this will be
# 'First line.\nSecond line.\n'
contents = output.getvalue()
# Close object and discard memory buffer --
# .getvalue() will now raise an exception.
output.close()
You might want to consider using a tempfile.SpooledTemporaryFile, which gives you the best of both worlds: it starts out as a temporary memory-based virtual file but automatically switches to a physical disk-based file if the data held in memory exceeds a specified size (its max_size argument).
Another nice feature is that (when using memory) it automatically uses either a binary buffer (io.BytesIO) or a text buffer (io.StringIO in older Python versions, an io.TextIOWrapper in recent Python 3 releases) depending on what mode is being used, allowing you to read and write either Unicode strings or binary data (bytes).
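As a rough illustration of the mode switch, assuming Python 3, the sketch below opens one spooled file in binary mode and one in text mode. The max_size value, the sample data and the variable names are arbitrary choices for the example.

```python
import tempfile

# Binary mode ('w+b' is the default): reads and writes take bytes.
with tempfile.SpooledTemporaryFile(max_size=1024, mode="w+b") as binary_spool:
    binary_spool.write(b"raw bytes")
    binary_spool.seek(0)
    binary_data = binary_spool.read()

# Text mode: reads and writes take str, encoded with the given encoding.
with tempfile.SpooledTemporaryFile(
    max_size=1024, mode="w+", encoding="utf-8"
) as text_spool:
    text_spool.write("unicode text: caf\u00e9")
    text_spool.seek(0)
    text_data = text_spool.read()
```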
The only tricky part might be the fact that you'll need to avoid closing the file between steps, because doing so would cause it to be deleted from memory or disk. Instead you can just rewind it back to the beginning with a seek(0) method call.
When you are completely done with the file and close it, it will automatically be deleted from disk if the amount of data in it caused it to be rolled-over to a physical file.
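The write, rewind, read workflow described above might be sketched like this, assuming Python 3. The function name process_in_steps and the small max_size are hypothetical, chosen so that the example data is large enough to force the switch to a disk-based file.

```python
import tempfile


def process_in_steps(chunks, max_size=64):
    """Write chunks, then read them back twice without closing in between."""
    with tempfile.SpooledTemporaryFile(max_size=max_size, mode="w+b") as spool:
        # Step 1: write the data. Once it grows past max_size, the
        # in-memory buffer is moved to a real temporary file.
        for chunk in chunks:
            spool.write(chunk)

        # Step 2: rewind instead of closing, then read everything back.
        spool.seek(0)
        first_pass = spool.read()

        # Step 3: rewind again for a second, independent pass.
        spool.seek(0)
        second_pass = spool.read()
    # Leaving the with-block closes the file and discards its contents.
    return first_pass, second_pass


first, second = process_in_steps([b"x" * 50 + b"\n"] * 3)
```

The caller does not need to know whether the data stayed in memory or was rolled over to disk; both passes see the same bytes.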