How to remove specific substrings from a set of strings in Python?
>>> x = 'Pear.good'
>>> y = x.replace('.good','')
>>> y
'Pear'
>>> x
'Pear.good'
str.replace doesn't change the string; it returns a copy of the string with the replacement applied. You can't change the string in place because strings are immutable.
You need to take the values returned by x.replace and put them in a new set.
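As a minimal sketch of that idea (the set contents and the names fruits and cleaned are made up for illustration), you could loop over the original set and add each returned string to a fresh one:

```python
# Hypothetical sample data; the names here are illustrative only.
fruits = {'Pear.good', 'Apple.good', 'Plum.bad'}

cleaned = set()
for item in fruits:
    # str.replace returns a new string; the element in `fruits` is untouched.
    cleaned.add(item.replace('.good', ''))

print(sorted(cleaned))  # ['Apple', 'Pear', 'Plum.bad']
```

The original set still holds the unmodified strings afterwards, so reassign the name if you only need the cleaned version.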
In Python 3.9+ you could remove the suffix using str.removesuffix('mysuffix'). From the docs:
If the string ends with the suffix string and that suffix is not empty, return string[:-len(suffix)]. Otherwise, return a copy of the original string.
So you can either create a new empty set and add each element without the suffix to it:
set1 = {'Apple.good', 'Orange.good', 'Pear.bad', 'Pear.good', 'Banana.bad', 'Potato.bad'}
set2 = set()
for s in set1:
    set2.add(s.removesuffix(".good").removesuffix(".bad"))
Or create the new set using a set comprehension:
set2 = {s.removesuffix(".good").removesuffix(".bad") for s in set1}
print(set2)
Output:
{'Orange', 'Pear', 'Apple', 'Banana', 'Potato'}
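If you are stuck on a Python older than 3.9, the behaviour described in the docs can be approximated with endswith and slicing. This is only a sketch: strip_suffix is a hypothetical helper name, not part of the standard library, and the sample set is shortened for illustration.

```python
def strip_suffix(text, suffix):
    """Hypothetical stand-in for str.removesuffix on Python < 3.9."""
    # Guard against an empty suffix: text[:-0] would be the empty string.
    if suffix and text.endswith(suffix):
        return text[:-len(suffix)]
    return text

set1 = {'Apple.good', 'Pear.bad', 'Pear.good'}
set2 = {strip_suffix(strip_suffix(s, '.good'), '.bad') for s in set1}
print(sorted(set2))  # ['Apple', 'Pear']
```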
Strings are immutable. str.replace creates a new string. This is stated in the documentation:
str.replace(old, new[, count])
Return a copy of the string with all occurrences of substring old replaced by new. [...]
This means you have to either build a new set or repopulate the existing one (building a new one is easier with a set comprehension):
new_set = {x.replace('.good', '').replace('.bad', '') for x in set1}
P.S. If you want to remove a prefix or suffix from a string and you're using Python 3.9 or newer, use str.removeprefix() or str.removesuffix() instead:
new_set = {x.removesuffix('.good').removesuffix('.bad') for x in set1}
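The difference matters when the substring can also appear elsewhere in the string. A small illustration (the sample string is made up, and Python 3.9+ is assumed for removesuffix):

```python
name = 'Very.good.Pear.good'  # hypothetical value containing '.good' twice

# str.replace removes every occurrence, including the one in the middle.
replaced = name.replace('.good', '')

# str.removesuffix only strips the match at the very end.
stripped = name.removesuffix('.good')

print(replaced)  # Very.Pear
print(stripped)  # Very.good.Pear
```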