Remove non-ASCII characters from pandas column

A common trick is to encode to ASCII with errors="ignore", which silently drops any character that cannot be encoded, and then decode the resulting bytes back into a string:

df['DB_user'].str.encode('ascii', 'ignore').str.decode('ascii')

On Python 3, this is my recommended solution.
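Note that the expression returns a new Series rather than modifying the frame, so you would normally assign it back. A minimal sketch, assuming a toy frame whose DB_user values are made up for illustration (including one missing value to show that it stays missing):

```python
import pandas as pd

# Hypothetical sample data; only the column name comes from the question.
df = pd.DataFrame({'DB_user': ['Déjà vu', None, 'naïve_user']})

# Encode to ASCII bytes (dropping what cannot be encoded), decode back to str,
# and assign the result to the column.
df['DB_user'] = df['DB_user'].str.encode('ascii', 'ignore').str.decode('ascii')

print(df)
```

Missing values pass through the .str accessor untouched, so they do not need special handling here.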


Minimal Code Sample

import pandas as pd

s = pd.Series(['Déjà vu', 'Ò|zz', ';test 123'])
s

0      Déjà vu
1         Ò|zz
2    ;test 123
dtype: object


s.str.encode('ascii', 'ignore').str.decode('ascii')

0        Dj vu
1          |zz
2    ;test 123
dtype: object

P.S.: The same approach extends to other encodings (not just ASCII): encode with the target encoding and errors="ignore" to filter out any characters that encoding cannot represent.
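As a rough illustration of that extension, the sketch below swaps in Latin-1 as the target encoding. The choice of Latin-1 and the sample strings are assumptions made for the example; accented Western European letters survive, while the Greek capital omega does not:

```python
import pandas as pd

# Sample strings chosen for illustration: 'Ω' is not representable in Latin-1.
s = pd.Series(['Déjà vu', 'Ω|zz', ';test 123'])

# Same encode/decode round trip, but with Latin-1 instead of ASCII.
cleaned = s.str.encode('latin-1', 'ignore').str.decode('latin-1')

print(cleaned.tolist())
```

Any codec name that Python recognises can be used in place of 'latin-1', as long as the same name is passed to both encode and decode.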


You may also try a regex replacement that removes every run of characters outside the ASCII range (\x00-\x7F):

df['DB_user'] = df['DB_user'].replace({r'[^\x00-\x7F]+': ''}, regex=True)
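For comparison, here is a small sketch of the same regex idea written with the .str.replace accessor, checked against the encode/decode round trip on sample data. The frame contents are illustrative, and .str.replace is a swapped-in alternative to the replace call above, not the answer's exact code:

```python
import pandas as pd

# Illustrative data; only the column name DB_user comes from the question.
df = pd.DataFrame({'DB_user': ['Déjà vu', 'Ò|zz', ';test 123']})

# Regex approach: delete every run of characters outside \x00-\x7F.
via_regex = df['DB_user'].str.replace(r'[^\x00-\x7F]+', '', regex=True)

# Encode/decode approach for comparison.
via_codec = df['DB_user'].str.encode('ascii', 'ignore').str.decode('ascii')

print(via_regex.tolist())
print(via_regex.tolist() == via_codec.tolist())
```

On this sample the two approaches should agree, so the choice between them is mostly a matter of readability.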