Writing large pandas DataFrames to a CSV file in chunks
Solution:
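import os

header = True
for chunk in chunks:
    # Append every chunk to the same file; only the first write emits the header row.
    # columns= selects the columns to write (older pandas releases spelled it cols=).
    chunk.to_csv(os.path.join(folder, new_folder, "new_file_" + filename),
                 header=header, columns=['TIME', 'STUFF'], mode='a')
    header = False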
Notes:
- The mode='a' argument tells pandas to append.
- We only write a column header on the first chunk.
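For a version you can run end to end, here is a minimal sketch of the same append loop. It assumes the chunks come from pd.read_csv with a chunksize; the helper name copy_csv_in_chunks, the sample data, and the temporary file paths are made up for illustration. It also passes index=False and deletes any existing output file first, neither of which the snippet above does.

```python
import os
import tempfile

import pandas as pd


def copy_csv_in_chunks(src_path, dst_path, columns, chunksize=1000):
    """Stream src_path to dst_path chunk by chunk, keeping only `columns`."""
    # mode='a' appends, so remove output left over from a previous run.
    if os.path.exists(dst_path):
        os.remove(dst_path)

    header = True
    for chunk in pd.read_csv(src_path, chunksize=chunksize):
        chunk.to_csv(dst_path, columns=columns, header=header,
                     mode="a", index=False)
        header = False  # later chunks must not repeat the header row


# Hypothetical sample data, written to a temporary directory.
tmp_dir = tempfile.mkdtemp()
src = os.path.join(tmp_dir, "input.csv")
dst = os.path.join(tmp_dir, "new_file_input.csv")
pd.DataFrame({
    "TIME": range(10),
    "STUFF": range(10, 20),
    "OTHER": range(20, 30),
}).to_csv(src, index=False)

copy_csv_in_chunks(src, dst, columns=["TIME", "STUFF"], chunksize=3)
result = pd.read_csv(dst)
```

Without the os.remove step, running the script twice would append the second run's rows (and a second header) to the first run's file.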
Check out the chunksize argument of the to_csv method, described in the pandas DataFrame.to_csv docs.
Writing to a file would look like:
df.to_csv("path/to/save/file.csv", chunksize=1000, columns=['TIME', 'STUFF'])
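As a quick check of what chunksize does, the sketch below writes a small, made-up DataFrame twice, once with chunksize and once without, and compares the results. It writes to in-memory buffers rather than a real path, and the column values are invented for illustration. The point it tries to show is that chunksize only controls how many rows are written at a time; the resulting CSV text should be the same either way.

```python
import io

import pandas as pd

# Hypothetical sample frame; only TIME and STUFF are written out.
df = pd.DataFrame({
    "TIME": range(5),
    "STUFF": list("abcde"),
    "OTHER": range(5),
})

# Write two rows at a time.
chunked = io.StringIO()
df.to_csv(chunked, chunksize=2, columns=["TIME", "STUFF"], index=False)

# Write with the default chunking for comparison.
whole = io.StringIO()
df.to_csv(whole, columns=["TIME", "STUFF"], index=False)

same_output = chunked.getvalue() == whole.getvalue()
lines = chunked.getvalue().splitlines()
```

Note that this approach still needs the whole DataFrame in memory; it only batches the write. If the data does not fit in memory to begin with, the append loop is the better fit.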