How to remove symbols from a string with Python?

Sometimes it takes longer to figure out the regex than to just write it out in python:

import string
s = "how much for the maple syrup? $20.99? That's ricidulous!!!"
for char in string.punctuation:
    s = s.replace(char, ' ')

If you need other characters you can change it to use a white-list or extend your black-list.

Sample white-list:

whitelist = string.letters + string.digits + ' '
new_s = ''
for char in s:
    if char in whitelist:
        new_s += char
    else:
        new_s += ' '

Sample white-list using a generator-expression:

whitelist = string.letters + string.digits + ' '
new_s = ''.join(c for c in s if c in whitelist)

One way, using regular expressions:

>>> s = "how much for the maple syrup? $20.99? That's ridiculous!!!"
>>> re.sub(r'[^\w]', ' ', s)
'how much for the maple syrup   20 99  That s ridiculous   '
  • \w will match alphanumeric characters and underscores

  • [^\w] will match anything that's not alphanumeric or underscore