How do I read a random line from one file?
import random
with open('file.txt') as f:
    lines = f.read().splitlines()
myline = random.choice(lines)
print(myline)
For a very long file: seek to a random position in the file based on its length, then find the two newline characters that follow that position (or a newline and the end of the file); the text between them is a complete line. If you ended up inside the last line, repeat the search starting 100 characters earlier, or from the beginning of the file if the original seek position was less than 100.
However, this is overcomplicated, since a file is an iterator. So turn it into a list and use random.choice (if you need several lines, use random.sample):
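A minimal sketch of the seek idea, under some simplifying assumptions: the function name random_line_by_seek is hypothetical, the file is assumed to be UTF-8 text, and instead of backing up 100 characters when the seek lands in the last line, this version simply wraps around to the first line. Note that the pick is not uniform: a line is more likely to be chosen when the line before it is long, because more byte offsets lead to it.

```python
import os
import random

def random_line_by_seek(path):
    """Return one line from `path` without reading the whole file."""
    size = os.path.getsize(path)
    if size == 0:
        raise ValueError("file is empty")
    with open(path, "rb") as f:
        # Jump to a random byte offset inside the file.
        f.seek(random.randrange(size))
        # Discard the (probably partial) line we landed in.
        f.readline()
        # The next line is complete, if there is one.
        line = f.readline()
        if not line:
            # We landed in the last line: wrap around to the first line.
            f.seek(0)
            line = f.readline()
    # Assumption: UTF-8 content; strip the line terminator.
    return line.decode("utf-8").rstrip("\r\n")
```

This reads only a couple of lines per call, which is the point for very large files, at the cost of the bias described above.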
import random
print(random.choice(list(open('file.txt'))))
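For the "many lines" case mentioned above, a small sketch using random.sample; the helper name random_lines and its parameters are made up for illustration. random.sample picks distinct positions, so no line position is returned twice, and it raises ValueError if k is larger than the number of lines.

```python
import random

def random_lines(path, k):
    """Return k lines from `path`, chosen without replacement."""
    with open(path) as f:
        # splitlines() drops the trailing newline from each line.
        lines = f.read().splitlines()
    # Raises ValueError if k > len(lines).
    return random.sample(lines, k)
```

Like the one-liner, this loads the whole file into memory, so it only suits files that fit comfortably in RAM.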
It depends on what you mean by "too much" overhead. If storing the whole file in memory is possible, then something like
import random
random_line = random.choice(open("file").readlines())
would do the trick.
Not built-in, but Algorithm R (3.4.2) (Waterman's "Reservoir Algorithm") from Knuth's "The Art of Computer Programming" is good (in a very simplified version):
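import random

def random_line(afile):
    # Start with the first line as the current pick.
    line = next(afile)
    for num, aline in enumerate(afile, 2):
        # randrange(num) is nonzero with probability (num - 1) / num:
        # in that case keep the current pick.
        if random.randrange(num):
            continue
        line = aline
    return line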
The num, ... in enumerate(..., 2) iterator produces the sequence 2, 3, 4, ... The randrange call will therefore return 0 with a probability of 1.0/num, and that is exactly the probability with which we must replace the currently selected line. (This is the special case of sample size 1 of the referenced algorithm; see Knuth's book for the proof of correctness. And of course we're also in the case of a small-enough "reservoir" to fit in memory ;-))
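To see why this gives a uniform pick: line k is selected with probability 1/k and then survives each later line j with probability (j - 1)/j, so for a file of n lines it ends up chosen with probability (1/k) * (k/(k+1)) * ... * ((n-1)/n) = 1/n. The sketch below is only an empirical sanity check of that claim, not a proof; the helper selection_counts and the use of io.StringIO as a stand-in for a file are assumptions made for illustration.

```python
import io
import random
from collections import Counter

def random_line(afile):
    # Same function as in the answer above.
    line = next(afile)
    for num, aline in enumerate(afile, 2):
        if random.randrange(num):
            continue
        line = aline
    return line

def selection_counts(text, trials=10000):
    """Count how often each line of `text` is picked over many trials."""
    counts = Counter()
    for _ in range(trials):
        # io.StringIO iterates line by line, like an open text file.
        counts[random_line(io.StringIO(text))] += 1
    return counts

if __name__ == "__main__":
    # With four lines, each count should come out near trials / 4.
    print(selection_counts("a\nb\nc\nd\n"))
```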