Filter pandas dataframe with specific column names in python

You can index the data frame with mylist directly, and pandas will select those columns for you.

mydata_new = mydata[mylist]

I'm not sure whether the yyy in your list is a typo.

Your loop does not work because each iteration reassigns mydata_new to a single series, overwriting the previous one.

for item in mylist:
    mydata_new = mydata[item]  # <- overwritten on every iteration

So after the loop, mydata_new holds only the last column as a series, not the whole df you want.
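
Here is a minimal sketch of the difference. The sample values and the variable name last_only are made up for illustration; only mydata, mylist and mydata_new come from the question.

```python
import pandas as pd

# Hypothetical sample data, only for illustration
mydata = pd.DataFrame({"yyy": [10, 9, 8], "nnn": [5, 3, 7], "mmm": [5, 4, 0]})
mylist = ["nnn", "mmm"]

# Loop version: every pass replaces the result with one column
for item in mylist:
    last_only = mydata[item]
print(type(last_only).__name__)   # a Series holding just the last column

# List indexing: all requested columns in a single DataFrame
mydata_new = mydata[mylist]
print(type(mydata_new).__name__)
print(list(mydata_new.columns))
```

The columns of mydata_new come back in the order given in mylist.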


If some of the names in the list are not columns of your data frame, you can check for them with:

len(set(mylist) - set(mydata.columns)) > 0

and print them out:

print(set(mylist) - set(mydata.columns))

Then check whether they are typos or something else you did not intend.
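
The check can be wrapped up as below. This is only a sketch: the sample frame and the name "zzzzzz" are assumptions for illustration, and sorting the result is my own choice so the printed order is stable (a plain set has no guaranteed order).

```python
import pandas as pd

# Hypothetical sample data; "zzzzzz" is deliberately not a column
mydata = pd.DataFrame({"yyy": [10, 9, 8], "nnn": [5, 3, 7], "mmm": [5, 4, 0]})
mylist = ["nnn", "mmm", "zzzzzz"]

# Names requested but absent from the frame, sorted for stable output
missing = sorted(set(mylist) - set(mydata.columns))

if missing:
    print("Not in the data frame:", missing)
```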


Just pass a list of column names to index df:

df[['nnn', 'mmm', 'yyy']]

   nnn  mmm  yyy
0    5    5   10
1    3    4    9
2    7    0    8

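If your list may contain column names that do not exist, filter with df.columns.isin instead. Missing names are ignored, and the columns come back in the data frame's own order rather than the list's: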

df.loc[:, df.columns.isin(['nnn', 'mmm', 'yyy', 'zzzzzz'])]

   yyy  nnn  mmm
0   10    5    5
1    9    3    4
2    8    7    0
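
If you want to skip missing names but keep the order of your list, one option is to filter the list itself before indexing. A sketch, assuming the same three columns as above (the variable names wanted, by_mask and by_list are made up for this example):

```python
import pandas as pd

# Same shape as the example above; column order in the frame is yyy, nnn, mmm
df = pd.DataFrame({"yyy": [10, 9, 8], "nnn": [5, 3, 7], "mmm": [5, 4, 0]})
wanted = ["nnn", "mmm", "yyy", "zzzzzz"]

# Boolean mask over the columns: keeps the frame's column order
by_mask = df.loc[:, df.columns.isin(wanted)]

# Filter the list first: keeps the list's order, drops unknown names
by_list = df[[col for col in wanted if col in df.columns]]

print(list(by_mask.columns))
print(list(by_list.columns))
```

Both versions drop "zzzzzz" silently, so combine them with an explicit check if a missing name should be treated as an error.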