How do I read two lines from a file at a time using Python?

import itertools

with open('a') as f:
    # Passing the same file iterator twice makes each tuple hold two consecutive lines.
    for line1, line2 in itertools.zip_longest(*[f] * 2):
        print(line1, line2)

itertools.zip_longest() returns an iterator, so it'll work well even if the file is billions of lines long.

If there are an odd number of lines, then line2 is set to None on the last iteration.

On Python 2, use itertools.izip_longest instead.
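To make the odd-line behaviour concrete, here is a minimal sketch that uses io.StringIO as a stand-in for an open file (the three-line sample text is made up for illustration). It also shows the fillvalue argument, which may be more convenient than None if the lines are going to be treated as strings:

```python
import io
import itertools

# Stand-in for an open file: io.StringIO iterates line by line like a file object.
f = io.StringIO("a\nb\nc\n")

# The default fill value is None; the last pair is padded because there are 3 lines.
pairs = list(itertools.zip_longest(*[f] * 2))
print(pairs)  # [('a\n', 'b\n'), ('c\n', None)]

# fillvalue='' keeps every element a string, so .strip() needs no None check.
f.seek(0)
padded = [(x.strip(), y.strip())
          for x, y in itertools.zip_longest(*[f] * 2, fillvalue='')]
print(padded)  # [('a', 'b'), ('c', '')]
```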


In the comments, it has been asked if this solution reads the whole file first, and then iterates over the file a second time. I believe that it does not. The with open('a') as f line opens a file handle, but does not read the file. f is an iterator, so its contents are not read until requested. zip_longest takes iterators as arguments, and returns an iterator.

zip_longest is indeed fed the same iterator, f, twice. But what ends up happening is that next(f) is called for the first argument and then again for the second argument. Since next() is being called on the same underlying iterator, successive lines are yielded. This is very different from reading in the whole file. Indeed, the purpose of using iterators is precisely to avoid reading in the whole file.

I therefore believe the solution works as desired -- the file is only read once by the for-loop.
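The alternating next() calls can be observed directly. The sketch below is only an illustration: logged_lines is a hypothetical helper (not part of itertools) that records each line at the moment it is pulled, and a small list stands in for the file:

```python
import itertools

consumed = []

def logged_lines(lines):
    """Yield lines one at a time, recording each one as it is consumed."""
    for line in lines:
        consumed.append(line)
        yield line

source = logged_lines(["1\n", "2\n", "3\n", "4\n"])

# The same iterator object is passed twice, as in zip_longest(*[f]*2).
pairs = itertools.zip_longest(source, source)

first = next(pairs)
print(first)     # ('1\n', '2\n')
print(consumed)  # ['1\n', '2\n'], only two lines have been pulled so far
```

After the first pair is produced, only the first two lines have been consumed; nothing is read ahead.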

To corroborate this, I ran the zip_longest solution against a solution using f.readlines(). I put an input() call at the end to pause each script, and ran ps axuw on each:

% ps axuw | grep zip_longest_method.py

unutbu 11119 2.2 0.2 4520 2712 pts/0 S+ 21:14 0:00 python /home/unutbu/pybin/zip_longest_method.py bigfile

% ps axuw | grep readlines_method.py

unutbu 11317 6.5 8.8 93908 91680 pts/0 S+ 21:16 0:00 python /home/unutbu/pybin/readlines_method.py bigfile

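The readlines version clearly reads in the whole file at once: its resident memory (the RSS column) is 91680 KB, versus 2712 KB for the zip_longest version. Since zip_longest_method.py uses so much less memory, I think it is safe to conclude it is not reading in the whole file at once.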
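A similar comparison can be made from inside Python. This sketch swaps ps for the standard-library tracemalloc module, which tracks Python-level allocations rather than process RSS, so the numbers are not comparable to the ps output. The function names and the 200,000-line temporary file are assumptions made for illustration:

```python
import itertools
import os
import tempfile
import tracemalloc

def peak_bytes(func, path):
    """Return the peak traced allocation, in bytes, while func(path) runs."""
    tracemalloc.start()
    try:
        func(path)
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return peak

def pairs_with_zip_longest(path):
    with open(path) as f:
        for line1, line2 in itertools.zip_longest(*[f] * 2):
            pass  # a real script would process the pair here

def pairs_with_readlines(path):
    with open(path) as f:
        lines = f.readlines()  # the whole file is held in memory at once
    for line1, line2 in itertools.zip_longest(*[iter(lines)] * 2):
        pass

# Hypothetical test file: 200,000 short lines written to a temporary path.
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.writelines("line number %d\n" % i for i in range(200000))

streaming_peak = peak_bytes(pairs_with_zip_longest, tmp.name)
readlines_peak = peak_bytes(pairs_with_readlines, tmp.name)
os.remove(tmp.name)

print(streaming_peak, readlines_peak)
```

The exact figures depend on the Python version and platform, but the readlines peak should be far larger than the streaming peak.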


Similar question here. On Python 2 you can't mix iterating over the file with readline() (it raises a ValueError), so you need to use one or the other.

with open('a') as f:
    while True:
        line1 = f.readline()
        line2 = f.readline()
        if not line1:
            break  # EOF: no lines left
        # line2 is '' here if the file has an odd number of lines
        ...
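The readline() loop can also be wrapped in a generator so the calling code stays a plain for-loop. This is only a sketch: read_pairs is a hypothetical helper name, and io.StringIO stands in for an open file. It checks line1 rather than line2 for EOF, so a trailing unpaired line is still yielded (paired with an empty string):

```python
import io

def read_pairs(f):
    """Yield (line1, line2) tuples via readline(); line2 is '' for a trailing unpaired line."""
    while True:
        line1 = f.readline()
        if not line1:  # readline() returns '' only at EOF
            return
        line2 = f.readline()
        yield line1, line2

# io.StringIO stands in for an open file here.
sample = io.StringIO("a\nb\nc\n")
result = list(read_pairs(sample))
print(result)  # [('a\n', 'b\n'), ('c\n', '')]
```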
