How to remove multiple columns that end with the same text in Pandas?
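Using filter and a regex:
df.filter(regex=r'^((?!prefix).)*$')
This pattern drops every column whose name contains 'prefix' anywhere. To drop only the columns that end with it, use a negative lookbehind anchored at the end:
df.filter(regex=r'(?<!prefix)$')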
Demo
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.rand(2, 6),
                  columns=['oneprefix', 'one',
                           'twoprefix', 'two',
                           'threeprefix', 'three'])
df.filter(regex=r'^((?!prefix).)*$')
The result keeps only the columns one, two and three; the other three columns of df are dropped.
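The demo can be run end to end as follows. This is a minimal sketch: the variable names no_prefix_anywhere and no_prefix_at_end are chosen here for illustration, and the second pattern is an end-anchored alternative rather than part of the original answer. On this particular frame both patterns keep the same columns.

```python
import numpy as np
import pandas as pd

# Same layout as the demo frame: three of the six names end with 'prefix'.
df = pd.DataFrame(
    np.random.rand(2, 6),
    columns=["oneprefix", "one", "twoprefix", "two", "threeprefix", "three"],
)

# Negative lookahead at every position: keep names that never contain 'prefix'.
no_prefix_anywhere = df.filter(regex=r"^((?!prefix).)*$")

# Negative lookbehind at the end: keep names that do not end with 'prefix'.
no_prefix_at_end = df.filter(regex=r"(?<!prefix)$")

print(list(no_prefix_anywhere.columns))  # ['one', 'two', 'three']
print(list(no_prefix_at_end.columns))    # ['one', 'two', 'three']
```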
Timing
All are about the same
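The timing claim can be checked locally. The sketch below is not the original benchmark: the frame size, the three candidate expressions and the repeat counts are all assumptions picked for illustration, and the numbers will vary by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Hypothetical benchmark frame: 1000 columns, every other name ends with 'prefix'.
names = [f"col{i}prefix" if i % 2 else f"col{i}" for i in range(1000)]
df = pd.DataFrame(np.random.rand(10, 1000), columns=names)

# Three ways of dropping the columns that end with 'prefix'.
candidates = {
    "filter": lambda: df.filter(regex=r"(?<!prefix)$"),
    "endswith": lambda: df.loc[:, ~df.columns.str.endswith("prefix")],
    "drop": lambda: df.drop(columns=[c for c in df.columns if c.endswith("prefix")]),
}

calls = 20
# Best of three runs, each run making `calls` calls.
timings = {
    name: min(timeit.repeat(fn, number=calls, repeat=3))
    for name, fn in candidates.items()
}

for name, seconds in timings.items():
    print(f"{name:>8}: {seconds / calls * 1e3:.3f} ms per call")
```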
df2 = df.loc[:, ~df.columns.str.endswith('prefix')]
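The boolean-mask approach wraps naturally into a small helper. In this sketch the function name drop_columns_with_suffix and the sample frame are hypothetical, and the astype(str) call is an added assumption to cope with non-string column labels.

```python
import pandas as pd


def drop_columns_with_suffix(df: pd.DataFrame, suffix: str) -> pd.DataFrame:
    """Return a new DataFrame without the columns whose name ends with suffix."""
    # astype(str) guards against non-string labels such as integers.
    keep = ~df.columns.astype(str).str.endswith(suffix)
    return df.loc[:, keep]


df = pd.DataFrame({"prefixcol1": [1, 2, 3], "col2prefix": [1, 2, 3], "colN": [1, 2, 3]})
df2 = drop_columns_with_suffix(df, "prefix")

print(list(df2.columns))  # ['prefixcol1', 'colN']
print(list(df.columns))   # the original frame still has all three columns
```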
For the sake of completeness, str.contains with an end-anchored regex:
In [306]: df
Out[306]:
   prefixcol1  col2prefix  col3prefix  colN
0           1           1           1     1
1           2           2           2     2
2           3           3           3     3
In [307]: df.loc[:, ~df.columns.str.contains('prefix$')]
Out[307]:
   prefixcol1  colN
0           1     1
1           2     2
2           3     3
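Or another variant (DataFrame.select was deprecated in pandas 0.21 and removed in 1.0, so this one only runs on older versions):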
In [388]: df.select(lambda x: re.search(r'prefix$', str(x)) is None, axis=1)
Out[388]:
   prefixcol1  colN
0           1     1
1           2     2
2           3     3
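DataFrame.select is not available in pandas 1.0 and later. A minimal sketch of the same regex test on a current version is to build a boolean list from re.search and pass it to df.loc. The names pattern, keep and result are illustrative, and the frame mirrors the one in the session above.

```python
import re

import pandas as pd

df = pd.DataFrame(
    {
        "prefixcol1": [1, 2, 3],
        "col2prefix": [1, 2, 3],
        "col3prefix": [1, 2, 3],
        "colN": [1, 2, 3],
    }
)

pattern = re.compile(r"prefix$")

# True for every column whose name does NOT end with 'prefix'.
keep = [pattern.search(str(col)) is None for col in df.columns]

result = df.loc[:, keep]
print(list(result.columns))  # ['prefixcol1', 'colN']
```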
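With drop and a list comprehension (testing 'prefix' in col instead would also drop columns such as prefixcol1, where the text is not at the end):
df2 = df.drop([col for col in df.columns if col.endswith('prefix')], axis=1)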
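The difference between a substring test and an end-of-name test shows up as soon as one column starts with the text. A small sketch, with a sample frame and variable names assumed for illustration:

```python
import pandas as pd

df = pd.DataFrame(
    {
        "prefixcol1": [1, 2, 3],
        "col2prefix": [1, 2, 3],
        "col3prefix": [1, 2, 3],
        "colN": [1, 2, 3],
    }
)

# Substring test: matches 'prefix' anywhere in the name.
contains = [col for col in df.columns if "prefix" in col]

# Suffix test: matches only names that end with 'prefix'.
ends_with = [col for col in df.columns if col.endswith("prefix")]

dropped_contains = df.drop(columns=contains)
dropped_ends_with = df.drop(columns=ends_with)

print(list(dropped_contains.columns))   # ['colN']
print(list(dropped_ends_with.columns))  # ['prefixcol1', 'colN']
```

Here drop(columns=...) is equivalent to passing the list with axis=1.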