read file into array separated by paragraph Python
I know this question was asked a long time ago, but I'm adding my answer in case it is useful to someone else. There is a much easier way to split the input file into paragraphs: read the whole file and split it on the paragraph separator (it can be a \n, a blank space, or anything else). The code snippet for your question is given below:
with open("input.txt", "r") as infile:  # avoid shadowing the built-in input()
    input_ = infile.read().split("\n\n")  # "\n\n" means there is a blank line between paragraphs
After this runs, printing input_[0] shows the first paragraph, input_[1] shows the second paragraph, and so on. In other words, all the paragraphs in the input file are put into a list, with each list element containing one paragraph.
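A plain split("\n\n") assumes exactly one blank line between paragraphs and Unix line endings. If that might not hold for your file, a regex-based split is one possible workaround. This is only a sketch: split_paragraphs and the sample text are names I made up for illustration, not part of the original question.

```python
import re

def split_paragraphs(text):
    """Split text into paragraphs on runs of one or more blank lines."""
    # A newline followed by one or more "blank" lines, where a blank line
    # may contain spaces/tabs and may end in \n or \r\n.
    parts = re.split(r'\r?\n(?:[ \t]*\r?\n)+', text.strip())
    # Drop empty pieces (e.g. when the input itself is empty).
    return [p for p in parts if p]

# Hypothetical sample: mixed line endings and an extra blank line.
sample = "First paragraph,\nsecond line.\n\nSecond paragraph.\r\n\r\n\r\nThird paragraph.\n"
paragraphs = split_paragraphs(sample)
```

Here paragraphs[0] keeps its internal newline, so multi-line paragraphs stay intact.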
import itertools as it

def paragraphs(fileobj, separator='\n'):
    """Iterate over a file object one paragraph at a time."""
    # Makes no assumptions about the encoding used in the file
    lines = []
    for line in fileobj:
        if line == separator:
            # A blank line ends the current paragraph; ignore repeated blanks
            if lines:
                yield ''.join(lines)
                lines = []
        else:
            lines.append(line)
    # Emit the last paragraph if the file does not end with a blank line
    if lines:
        yield ''.join(lines)

paragraph_lists = [[], [], []]
with open('/Users/robdev/Desktop/test.txt') as f:
    paras = paragraphs(f)
    # Deal the paragraphs round-robin into the three lists (Python 3: zip is lazy)
    for para, group in zip(paras, it.cycle(paragraph_lists)):
        group.append(para)
print(paragraph_lists)
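For comparison, the same line-by-line idea can probably be written with itertools.groupby, which groups consecutive lines by whether they are blank. This is an alternative sketch, not the code above: iter_paragraphs is a hypothetical name, and io.StringIO stands in for a real file so the example runs on its own.

```python
import io
import itertools

def iter_paragraphs(lines):
    """Yield paragraphs from an iterable of lines, treating blank lines as separators."""
    # groupby clusters consecutive lines with the same key:
    # True for lines with text, False for blank/whitespace-only lines.
    for has_text, group in itertools.groupby(lines, key=lambda line: bool(line.strip())):
        if has_text:
            yield ''.join(group)

# Stand-in for an open file; any iterable of lines works.
fake_file = io.StringIO("a\nb\n\n\nc\n\nd\n")
groups = [[], [], []]
for para, group in zip(iter_paragraphs(fake_file), itertools.cycle(groups)):
    group.append(para)
```

Like the generator above, this reads the file lazily, so it should cope with files too large to read() in one go.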
This is the basic code I would try:
# Read the whole file; the with block closes it automatically
with open('data.txt', 'r') as f:
    data = f.read()
array1 = []
array2 = []
array3 = []
splat = data.split("\n\n")
# Number the paragraphs from 1 and deal them out in turn
for number, paragraph in enumerate(splat, 1):
    if number % 3 == 1:
        array1.append(paragraph)
    elif number % 3 == 2:
        array2.append(paragraph)
    else:  # number % 3 == 0
        array3.append(paragraph)
This should be enough to get you started. If the paragraphs in the file are separated by two newlines (that is, a blank line), then "\n\n" should do the trick for splitting them.
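If you only need the "every third paragraph" distribution, extended slicing may be a shorter way to get the same three lists. A minimal sketch, assuming the paragraphs are already in a list; deal_round_robin and the sample text are made-up names for illustration.

```python
def deal_round_robin(items, n=3):
    """Return n lists; list i gets items i, i+n, i+2n, ..."""
    return [items[i::n] for i in range(n)]

# Hypothetical sample text with five paragraphs separated by blank lines.
text = "p1\n\np2\n\np3\n\np4\n\np5"
array1, array2, array3 = deal_round_robin(text.split("\n\n"))
```

The slice items[i::n] starts at index i and steps by n, which matches the number % 3 checks in the loop version.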