How do I read and write CSV files with Python?
If needed- read a csv file without using the csv module:
rows = []
with open('test.csv') as f:
for line in f:
# strip whitespace
line = line.strip()
# separate the columns
line = line.split(',')
# save the line for use later
rows.append(line)
Here are some minimal complete examples how to read CSV files and how to write CSV files with Python.
Python 3: Reading a CSV file
Pure Python
import csv
# Define data
data = [
(1, "A towel,", 1.0),
(42, " it says, ", 2.0),
(1337, "is about the most ", -1),
(0, "massively useful thing ", 123),
(-2, "an interstellar hitchhiker can have.", 3),
]
# Write CSV file
with open("test.csv", "wt") as fp:
writer = csv.writer(fp, delimiter=",")
# writer.writerow(["your", "header", "foo"]) # write header
writer.writerows(data)
# Read CSV file
with open("test.csv") as fp:
reader = csv.reader(fp, delimiter=",", quotechar='"')
# next(reader, None) # skip the headers
data_read = [row for row in reader]
print(data_read)
After that, the contents of data_read
are
[['1', 'A towel,', '1.0'],
['42', ' it says, ', '2.0'],
['1337', 'is about the most ', '-1'],
['0', 'massively useful thing ', '123'],
['-2', 'an interstellar hitchhiker can have.', '3']]
Please note that CSV reads only strings. You need to convert to the column types manually.
A Python 2+3 version was here before (link), but Python 2 support is dropped. Removing the Python 2 stuff massively simplified this answer.
Related
- How do I write data into csv format as string (not file)?
- How can I use io.StringIO() with the csv module?: This is interesting if you want to serve a CSV on-the-fly with Flask, without actually storing the CSV on the server.
mpu
Have a look at my utility package mpu
for a super simple and easy to remember one:
import mpu.io
data = mpu.io.read('example.csv', delimiter=',', quotechar='"', skiprows=None)
mpu.io.write('example.csv', data)
Pandas
import pandas as pd
# Read the CSV into a pandas data frame (df)
# With a df you can do many things
# most important: visualize data with Seaborn
df = pd.read_csv('myfile.csv', sep=',')
print(df)
# Or export it in many ways, e.g. a list of tuples
tuples = [tuple(x) for x in df.values]
# or export it as a list of dicts
dicts = df.to_dict().values()
See read_csv
docs for more information. Please note that pandas automatically infers if there is a header line, but you can set it manually, too.
If you haven't heard of Seaborn, I recommend having a look at it.
Other
Reading CSV files is supported by a bunch of other libraries, for example:
dask.dataframe.read_csv
spark.read.csv
Created CSV file
1,"A towel,",1.0
42," it says, ",2.0
1337,is about the most ,-1
0,massively useful thing ,123
-2,an interstellar hitchhiker can have.,3
Common file endings
.csv
Working with the data
After reading the CSV file to a list of tuples / dicts or a Pandas dataframe, it is simply working with this kind of data. Nothing CSV specific.
Alternatives
- JSON: Nice for writing human-readable data; VERY commonly used (read & write)
- CSV: Super simple format (read & write)
- YAML: Nice to read, similar to JSON (read & write)
- pickle: A Python serialization format (read & write)
- MessagePack (Python package): More compact representation (read & write)
- HDF5 (Python package): Nice for matrices (read & write)
- XML: exists too *sigh* (read & write)
For your application, the following might be important:
- Support by other programming languages
- Reading / writing performance
- Compactness (file size)
See also: Comparison of data serialization formats
In case you are rather looking for a way to make configuration files, you might want to read my short article Configuration files in Python