Get column name where value is something in pandas dataframe
Here is one, perhaps inelegant, way to do it:
df_result = pd.DataFrame(ts, columns=['value'])
Set up a function which grabs the column name which contains the value (from ts
):
def get_col_name(row):
b = (df.ix[row.name] == row['value'])
return b.index[b.argmax()]
for each row, test which elements equal the value, and extract column name of a True.
And apply
it (row-wise):
In [3]: df_result.apply(get_col_name, axis=1)
Out[3]:
1979-01-01 00:00:00 col5
1979-01-01 06:00:00 col3
1979-01-01 12:00:00 col1
1979-01-01 18:00:00 col1
i.e. use df_result['Column'] = df_result.apply(get_col_name, axis=1)
.
.
Note: there is quite a lot going on in get_col_name
so perhaps it warrants some further explanation:
In [4]: row = df_result.irow(0) # an example row to pass to get_col_name
In [5]: row
Out[5]:
value 1181.220328
Name: 1979-01-01 00:00:00
In [6]: row.name # use to get rows of df
Out[6]: <Timestamp: 1979-01-01 00:00:00>
In [7]: df.ix[row.name]
Out[7]:
col5 1181.220328
col4 912.154923
col3 648.848635
col2 390.986156
col1 138.185861
Name: 1979-01-01 00:00:00
In [8]: b = (df.ix[row.name] == row['value'])
#checks whether each elements equal row['value'] = 1181.220328
In [9]: b
Out[9]:
col5 True
col4 False
col3 False
col2 False
col1 False
Name: 1979-01-01 00:00:00
In [10]: b.argmax() # index of a True value
Out[10]: 0
In [11]: b.index[b.argmax()] # the index value (column name)
Out[11]: 'col5'
It might be there is more efficient way to do this...
Just wanted to add that for a situation where multiple columns may have the value and you want all the column names in a list, you can do the following (e.g. get all column names with a value = 'x'):
df.apply(lambda row: row[row == 'x'].index, axis=1)
The idea is that you turn each row into a series (by adding axis=1
) where the column names are now turned into the index of the series. You then filter your series with a condition (e.g. row == 'x'
), then take the index values (aka column names!).