Read specific columns with pandas or other python module
An easy way to do this is using the pandas
library like this.
import pandas as pd
fields = ['star_name', 'ra']
df = pd.read_csv('data.csv', skipinitialspace=True, usecols=fields)
# See the keys
print df.keys()
# See content in 'star_name'
print df.star_name
The problem here was the skipinitialspace
which remove the spaces in the header. So ' star_name' becomes 'star_name'
Above answers are in python2. So for python 3 users I am giving this answer. You can use the bellow code:
import pandas as pd
fields = ['star_name', 'ra']
df = pd.read_csv('data.csv', skipinitialspace=True, usecols=fields)
# See the keys
print(df.keys())
# See content in 'star_name'
print(df.star_name)
According to the latest pandas documentation you can read a csv file selecting only the columns which you want to read.
import pandas as pd
df = pd.read_csv('some_data.csv', usecols = ['col1','col2'], low_memory = True)
Here we use usecols
which reads only selected columns in a dataframe.
We are using low_memory
so that we Internally process the file in chunks.