Replace multiple substrings in a Pandas series with a value

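You can do this by joining the substrings into a single |-separated pattern. This works because pd.Series.str.replace accepts a regular expression (on pandas 2.0 and later, pass regex=True explicitly, since the default there is literal matching):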

Replace occurrences of pattern/regex in the Series/Index with some other string. Equivalent to str.replace() or re.sub().

This avoids the need to create a dictionary.

import pandas as pd

df = pd.DataFrame({'A': ['LOCAL TEST', 'TEST FOREIGN', 'ANOTHER HELLO', 'NOTHING']})

pattern = '|'.join(['LOCAL', 'FOREIGN', 'HELLO'])

# regex=True is needed on pandas 2.0+, where str.replace defaults to literal matching
df['A'] = df['A'].str.replace(pattern, 'CORP', regex=True)

#               A
# 0     CORP TEST
# 1     TEST CORP
# 2  ANOTHER CORP
# 3       NOTHING
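
One caveat with the joined pattern: if any substring contains regex metacharacters (such as + or .), it is interpreted as regex syntax rather than literal text. A minimal sketch of one way to guard against that, assuming you want every substring matched literally, is to pass each one through re.escape before joining. The series contents and the substrings list here are made up for illustration:

```python
import re
import pandas as pd

# Hypothetical data: the substrings contain regex metacharacters ('+' and '.')
s = pd.Series(['A+B TEST', 'TEST C.D', 'NOTHING'])
substrings = ['A+B', 'C.D']

# Escape each substring so it is matched literally, then join with '|'
pattern = '|'.join(re.escape(sub) for sub in substrings)

result = s.str.replace(pattern, 'CORP', regex=True)
print(result.tolist())
# ['CORP TEST', 'TEST CORP', 'NOTHING']
```

If some substrings are prefixes of others, it may also help to sort them longest-first before joining, because regex alternation tries the alternatives from left to right.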

Series.replace can accept a dict, so we just create a dict that maps each value to be replaced to its replacement:

dataUS['sec_type'].str.strip().replace(dict(zip(["LOCAL", "FOREIGN", "HELLO"], ["CORP"]*3)), regex=True)

Contents of the dict:

dict(zip(["LOCAL", "FOREIGN", "HELLO"], ["CORP"]*3))
Out[585]: {'FOREIGN': 'CORP', 'HELLO': 'CORP', 'LOCAL': 'CORP'}

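The reason you receive the error is that str.replace is different from replace: Series.str.replace expects a single pattern (a string or compiled regex) and a replacement string, whereas Series.replace also accepts a dict of replacements. With regex=True, the dict keys are treated as regex patterns and matched inside each string.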
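
To make the difference concrete, here is a small sketch using a made-up series in place of dataUS['sec_type'] (the values are assumptions for illustration). Without regex=True, Series.replace only replaces cells whose entire value equals a key; with regex=True it replaces matching substrings:

```python
import pandas as pd

# Hypothetical stand-in for dataUS['sec_type']
s = pd.Series([' LOCAL TEST ', 'TEST FOREIGN', 'ANOTHER HELLO', 'NOTHING'])

# Same mapping as above: every key maps to 'CORP'
mapping = dict.fromkeys(['LOCAL', 'FOREIGN', 'HELLO'], 'CORP')

# Default behaviour: a cell is replaced only if it equals a key exactly,
# so nothing changes here apart from the whitespace stripping.
whole_value = s.str.strip().replace(mapping)

# regex=True: keys are regex patterns, so matching substrings are replaced.
substring = s.str.strip().replace(mapping, regex=True)

print(whole_value.tolist())
# ['LOCAL TEST', 'TEST FOREIGN', 'ANOTHER HELLO', 'NOTHING']
print(substring.tolist())
# ['CORP TEST', 'TEST CORP', 'ANOTHER CORP', 'NOTHING']
```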
