Better way to remove multiple words from a string?
I currently use:
    bannedWord = ['Good', 'Bad', 'Ugly']
    toPrint = 'Hello Ugly Guy, Good To See You.'
    print(' '.join(i for i in toPrint.split() if i not in bannedWord))
Here's a solution with regex:
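    import re

    def RemoveBannedWords(toPrint, database):
        # Build one case-insensitive alternation from the banned words.
        # The trailing \W also consumes the separator after a removed word.
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, database)) + r")\W", re.I)
        return pattern.sub("", toPrint)

    bannedWord = ['Good', 'Bad', 'Ugly']
    toPrint = 'Hello Ugly Guy, Good To See You.'
    print(RemoveBannedWords(toPrint, bannedWord))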
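One side effect of the trailing \W is that it also swallows any punctuation directly after a banned word. If that matters, a minimal sketch (the function name remove_banned_words_keep_punct is made up for illustration, and it assumes collapsing repeated whitespace is acceptable) could match on word boundaries only and tidy up the spaces afterwards:

```python
import re

def remove_banned_words_keep_punct(text, banned):
    # Hypothetical helper, not part of the answer above.
    # \b on both sides matches whole words and leaves punctuation untouched.
    pattern = re.compile(r"\b(?:" + "|".join(map(re.escape, banned)) + r")\b", re.I)
    stripped = pattern.sub("", text)
    # Collapse the runs of whitespace left behind by removed words.
    return re.sub(r"\s{2,}", " ", stripped).strip()

print(remove_banned_words_keep_punct('Hello Ugly Guy, Good To See You.',
                                     ['Good', 'Bad', 'Ugly']))
```

The second re.sub is only cosmetic; drop it if the original spacing has to be preserved.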
A slight variation on Ajay's code, for the case where one string in the bannedWord list is a prefix of another:
    bannedWord = ['good', 'bad', 'good guy', 'ugly']
With toPrint = 'good winter good guy', the result would be
    RemoveBannedWords(toPrint, database=bannedWord) = 'winter guy'
because 'good' is removed first, which leaves 'guy' behind. The list has to be sorted by element length, longest first, so that longer phrases are tried before the shorter words they start with.
    import re

    def RemoveBannedWords(toPrint, database):
        # Longest entries first, so 'good guy' is tried before 'good'.
        database_1 = sorted(database, key=len, reverse=True)
        pattern = re.compile(r"\b(" + "|".join(map(re.escape, database_1)) + r")\W", re.I)
        # Append a space so \W can match after the last word, then drop the final character.
        return pattern.sub("", toPrint + ' ')[:-1]

    bannedWord = ['good', 'bad', 'good guy', 'ugly']
    toPrint = 'good winter good guy.'
    print(RemoveBannedWords(toPrint, bannedWord))
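Why the order of the alternatives matters can be checked in isolation. Python's re tries the branches of an alternation from left to right and accepts the first one that lets the whole pattern match. The snippet below is only an illustration with two hand-written patterns (the variable names are made up for it):

```python
import re

text = 'good winter good guy '

# Same two alternatives, different order.
shortest_first = re.compile(r"\b(good|good guy)\W", re.I)
longest_first = re.compile(r"\b(good guy|good)\W", re.I)

print(repr(shortest_first.sub("", text)))  # 'winter guy '
print(repr(longest_first.sub("", text)))   # 'winter '
```

With the shorter word first, 'good' always wins and 'good guy' is never matched as a whole.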