Setting smaller buffer size for sys.stdin?
You can completely remove buffering from stdin/stdout by using Python's -u flag:
-u : unbuffered binary stdout and stderr (also PYTHONUNBUFFERED=x)
see man page for details on internal buffering relating to '-u'
and the man page clarifies:
-u Force stdin, stdout and stderr to be totally unbuffered. On
systems where it matters, also put stdin, stdout and stderr in
binary mode. Note that there is internal buffering in
xreadlines(), readlines() and file-object iterators ("for line in
sys.stdin") which is not influenced by this option. To work
around this, you will want to use "sys.stdin.readline()" inside
a "while 1:" loop.
Beyond this, altering the buffering for an existing file is not supported, but you can make a new file object with the same underlying file descriptor as an existing one, and possibly different buffering, using os.fdopen. I.e.,
import os
import sys
newin = os.fdopen(sys.stdin.fileno(), 'r', 100)
should bind the name newin to a file object that reads the same FD as standard input, but buffered by only about 100 bytes at a time (and you could continue with sys.stdin = newin to use the new file object as standard input from there onwards). I say "should" because this area used to have a number of bugs and issues on some platforms (it's pretty hard functionality to provide cross-platform with full generality). I'm not sure what its state is now, but I'd definitely recommend thorough testing on all platforms of interest to ensure that everything goes smoothly. (-u, removing buffering entirely, should work with fewer problems across all platforms, if that might meet your requirements.)
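As a way to try the os.fdopen approach without touching the real standard input, here is a minimal sketch that runs it against a pipe. The names reopen_with_buffer and demo are made up for illustration, and passing closefd=False is my own assumption (so that closing the new file object leaves the original descriptor open); it relies on Python 3, where os.fdopen forwards its arguments to open().

```python
import os


def reopen_with_buffer(fd, buffer_size=100):
    """Return a new text-mode file object on fd with the requested buffer size.

    Hypothetical helper: closefd=False is an assumption made here so that
    closing the returned object does not close the descriptor it wraps.
    """
    return os.fdopen(fd, 'r', buffer_size, closefd=False)


def demo():
    # Use a pipe as a stand-in for standard input.
    read_fd, write_fd = os.pipe()
    os.write(write_fd, b"first\nsecond\n")
    os.close(write_fd)  # the reader sees EOF after the two lines

    reader = reopen_with_buffer(read_fd)
    try:
        # The third readline() returns '' because the pipe is at EOF.
        return [reader.readline(), reader.readline(), reader.readline()]
    finally:
        reader.close()
        os.close(read_fd)
```

For the real thing you would presumably call reopen_with_buffer(sys.stdin.fileno()) and assign the result to sys.stdin, with the same caveat about testing on each platform you care about.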
This worked for me in Python 3.4.3:
import os
import sys
unbuffered_stdin = os.fdopen(sys.stdin.fileno(), 'rb', buffering=0)
The documentation for fdopen() says it is just an alias for open(). open() has an optional buffering parameter:
buffering is an optional integer used to set the buffering policy. Pass 0 to switch buffering off (only allowed in binary mode), 1 to select line buffering (only usable in text mode), and an integer > 1 to indicate the size in bytes of a fixed-size chunk buffer.
In other words:
- Fully unbuffered stdin requires binary mode and passing zero as the buffer size.
- Line-buffering requires text mode.
- Any other buffer size seems to work in both binary and text modes (according to the documentation).
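A quick way to check these rules yourself is to open a throwaway descriptor with different mode/buffering combinations and see what comes back. This is only a sketch for Python 3: describe_buffering is a made-up helper, and it opens a pipe instead of stdin so it can run anywhere.

```python
import os


def describe_buffering(mode, buffering):
    """Open a pipe's read end with the given settings and report the result.

    Hypothetical helper: returns the class name of the file object that
    open() produced, or 'ValueError' if the combination was rejected.
    """
    read_fd, write_fd = os.pipe()
    os.close(write_fd)
    try:
        # closefd=False so that we stay responsible for closing read_fd.
        with open(read_fd, mode, buffering=buffering, closefd=False) as f:
            return type(f).__name__
    except ValueError:
        return 'ValueError'
    finally:
        os.close(read_fd)


if __name__ == '__main__':
    for mode, buffering in [('rb', 0), ('r', 0), ('r', 1), ('rb', 100)]:
        print(mode, buffering, describe_buffering(mode, buffering))
```

Unbuffered binary mode hands back the raw file object, unbuffered text mode is refused, and the other combinations return the usual buffered or text wrapper.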
Since sys.stdin.__iter__ is still line-buffered, you can get an iterator that behaves mostly identically (it stops at EOF, whereas stdin.__iter__ won't) by using the 2-argument form of iter to make an iterator over sys.stdin.readline:
import sys
for line in iter(sys.stdin.readline, ''):
    sys.stdout.write('> ' + line.upper())
Or provide None as the sentinel (but note that then you need to handle the EOF condition yourself).
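To make the None-sentinel variant concrete: readline() never returns None, so the iterator would run forever unless the loop checks for the empty string itself. A minimal sketch, using a made-up function name and an in-memory stream in place of sys.stdin:

```python
import io


def read_lines_with_none_sentinel(stream):
    """Collect lines via iter(stream.readline, None), stopping at EOF by hand."""
    lines = []
    for line in iter(stream.readline, None):
        if line == '':  # readline() returns '' at EOF; the sentinel never matches
            break
        lines.append(line)
    return lines


if __name__ == '__main__':
    # io.StringIO stands in for sys.stdin here.
    print(read_lines_with_none_sentinel(io.StringIO("a\nb\n")))
```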
You can simply use sys.stdin.readline() instead of sys.stdin.__iter__():
import sys
while True:
    line = sys.stdin.readline()
    if not line:
        break  # EOF
    sys.stdout.write('> ' + line.upper())
This gives me line-buffered reads using Python 2.7.4 and Python 3.3.1 on Ubuntu 13.04.