How to delete the last column of data of a pandas dataframe
You can specify which columns to import using the usecols parameter of read_csv.
So either create a list of column names or integer values:
cols_to_use = ['col1', 'col2'] # or [0,1,2,3]
df = pd.read_csv('mycsv.csv', usecols=cols_to_use)
Alternatively, drop the column after importing. I prefer the former method (why import data you are not interested in?).
df = df.drop(labels='column_to_delete', axis=1) # axis 1 drops columns, 0 will drop rows that match index value in labels
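As a minimal sketch of the two approaches above, the following uses an in-memory CSV (the `csv_text` contents and column names are made up for illustration) and shows that both give the same result:

```python
import io

import pandas as pd

# Hypothetical CSV contents, standing in for 'mycsv.csv'
csv_text = "col1,col2,col3\n1,2,3\n4,5,6\n"

# Approach 1: only import the wanted columns (by name or by position)
by_name = pd.read_csv(io.StringIO(csv_text), usecols=["col1", "col2"])
by_position = pd.read_csv(io.StringIO(csv_text), usecols=[0, 1])

# Approach 2: import everything, then drop the unwanted column
dropped = pd.read_csv(io.StringIO(csv_text)).drop(labels="col3", axis=1)

print(by_name.columns.tolist())
print(dropped.columns.tolist())
```

The first approach avoids parsing the unwanted column at all, which is why it tends to be preferable for large files.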
Note also that you misunderstand what tail does: it returns the last n rows (default is 5) of a dataframe, not the last columns.
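A small illustration of what tail returns, using a made-up frame with 10 rows and 3 columns:

```python
import pandas as pd

# Hypothetical example data: 10 rows, 3 columns
df = pd.DataFrame({"a": range(10), "b": range(10, 20), "c": range(20, 30)})

last_rows = df.tail()    # last 5 rows, all 3 columns kept
last_two = df.tail(2)    # last 2 rows, all 3 columns kept

print(last_rows.shape)
print(last_two.shape)
```

In both cases every column is still present; only the rows are trimmed.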
Additional
If the number of columns varies, you can read just the header to get the column names, then read the csv again properly without the last column:
import pandas as pd

def df_from_csv(path):
    df = pd.read_csv(path, nrows=1)  # read only the header and first data row to get the columns
    columns = df.columns.tolist()  # get the column names
    cols_to_use = columns[:-1]  # keep all but the last one
    df = pd.read_csv(path, usecols=cols_to_use)  # re-read without the last column
    return df
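A variant of the same idea, as a sketch: passing nrows=0 reads the header only, which is enough to get the column names. The function name `read_without_last_column` and the temporary file are hypothetical, used only to make the example self-contained:

```python
import os
import tempfile

import pandas as pd


def read_without_last_column(path):
    # nrows=0 parses only the header line, giving an empty frame with the columns
    header = pd.read_csv(path, nrows=0)
    cols_to_use = header.columns[:-1].tolist()  # all but the last column
    return pd.read_csv(path, usecols=cols_to_use)


# Write a small throwaway CSV so the example can run anywhere
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w") as f:
        f.write("a,b,c\n1,2,3\n4,5,6\n")
    trimmed = read_without_last_column(path)

print(trimmed.columns.tolist())
```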
Another method to delete last column in DataFrame df:
df = df.iloc[:, :-1]
Here's a one-liner that does not require specifying the column name:
df.drop(df.columns[-1], axis=1, inplace=True)
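The positional and label-based methods agree when column names are unique, but can differ when names are duplicated, because drop removes every column carrying the given label. A short sketch with made-up frames:

```python
import pandas as pd

# Unique column names: both methods give the same result
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]], columns=["a", "b", "c"])
by_position = df.iloc[:, :-1]
by_label = df.drop(df.columns[-1], axis=1)

# Duplicated column names: the last column shares its label with the first
dup = pd.DataFrame([[1, 2, 3]], columns=["a", "b", "a"])
dup_position = dup.iloc[:, :-1]              # removes only the last column
dup_label = dup.drop(dup.columns[-1], axis=1)  # removes both 'a' columns

print(dup_position.columns.tolist())
print(dup_label.columns.tolist())
```

If duplicate column names are possible, the iloc form is the safer choice for removing exactly one column.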