Read specific columns with pandas or other python module

An easy way to do this is using the pandas library like this.

import pandas as pd
fields = ['star_name', 'ra']

df = pd.read_csv('data.csv', skipinitialspace=True, usecols=fields)
# See the keys
print df.keys()
# See content in 'star_name'
print df.star_name

The problem here was the skipinitialspace which remove the spaces in the header. So ' star_name' becomes 'star_name'


Above answers are in python2. So for python 3 users I am giving this answer. You can use the bellow code:

import pandas as pd
fields = ['star_name', 'ra']

df = pd.read_csv('data.csv', skipinitialspace=True, usecols=fields)
# See the keys
print(df.keys())
# See content in 'star_name'
print(df.star_name)

According to the latest pandas documentation you can read a csv file selecting only the columns which you want to read.

import pandas as pd

df = pd.read_csv('some_data.csv', usecols = ['col1','col2'], low_memory = True)

Here we use usecols which reads only selected columns in a dataframe.

We are using low_memory so that we Internally process the file in chunks.

Tags:

Python

Pandas

Csv