Fast way to remove lowercase substrings from a string?
import re
remove_lower = lambda text: re.sub('[a-z]', '', text)
s = "FOObarFOOObBAR"
s = remove_lower(s)
print(s)
My first approach would be ''.join(x for x in s if not x.islower())
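If you need speed, use mgilson's answer; it is a lot faster. Note that the timings below were taken on Python 2: the two-argument str.translate(None, ...) call raises a TypeError on Python 3.
>>> import timeit
>>> timeit.timeit("''.join(x for x in 'FOOBarBaz' if not x.islower())")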
3.318969964981079
>>> timeit.timeit("'FOOBarBaz'.translate(None, string.ascii_lowercase)", "import string")
0.5369198322296143
>>> timeit.timeit("re.sub('[a-z]', '', 'FOOBarBaz')", "import re")
3.631659984588623
>>> timeit.timeit("r.sub('', 'FOOBarBaz')", "import re; r = re.compile('[a-z]')")
1.9642360210418701
>>> timeit.timeit("''.join(x for x in 'FOOBarBaz' if x not in lowercase)", "lowercase = set('abcdefghijklmnopqrstuvwxyz')")
2.9605889320373535
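For anyone who wants to repeat the comparison on Python 3, here is a minimal sketch of the same benchmark. The function names (via_islower, via_translate, via_regex, via_set) and the run_benchmark helper are made up for illustration; they are not from any of the answers. Absolute numbers depend on the machine and interpreter, so none are shown.

```python
import re
import string
import timeit

SAMPLE = "FOOBarBaz"

# Built once and reused, so setup cost is not counted in each call.
TABLE = str.maketrans("", "", string.ascii_lowercase)
PATTERN = re.compile("[a-z]")
LOWERCASE = set(string.ascii_lowercase)


def via_islower(s):
    # Generator expression with str.islower (also matches non-ASCII lowercase).
    return "".join(x for x in s if not x.islower())


def via_translate(s):
    # Python 3 form of the translate approach: one table argument.
    return s.translate(TABLE)


def via_regex(s):
    # Precompiled regular expression.
    return PATTERN.sub("", s)


def via_set(s):
    # Membership test against a set of ASCII lowercase letters.
    return "".join(x for x in s if x not in LOWERCASE)


CANDIDATES = [via_islower, via_translate, via_regex, via_set]


def run_benchmark(number=100_000):
    # Hypothetical helper: maps each function name to its total time in seconds.
    return {
        f.__name__: timeit.timeit(lambda f=f: f(SAMPLE), number=number)
        for f in CANDIDATES
    }


if __name__ == "__main__":
    for name, seconds in run_benchmark().items():
        print(f"{name}: {seconds:.3f}s")
```

All four functions agree on ASCII input, but via_islower also strips non-ASCII lowercase letters such as 'é', which the other three leave in place.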
Python 3.x answer:
You can make a string translation table. Once that translation table has been created, you can use it repeatedly:
>>> import string
>>> table = str.maketrans('', '', string.ascii_lowercase)
>>> s = 'FOObarFOOObBAR'
>>> s.translate(table)
'FOOFOOOBAR'
When called with three arguments, each character in the first argument maps to the character at the same position in the second argument. Characters that are not listed are left unchanged (an identity mapping). The third argument is the collection of characters to be removed.
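A small sketch of how the three arguments to str.maketrans interact, plus a reusable wrapper. The names DEMO_TABLE, LOWER_TABLE, and remove_lower are only illustrative.

```python
import string

# 'a' -> 'x', 'b' -> 'y', and every 'c' is deleted; other characters pass through.
DEMO_TABLE = str.maketrans("ab", "xy", "c")

# Empty first and second arguments: nothing is remapped, only deletions happen.
LOWER_TABLE = str.maketrans("", "", string.ascii_lowercase)


def remove_lower(text):
    # Reuses the module-level table instead of rebuilding it on every call.
    return text.translate(LOWER_TABLE)


print("abcd".translate(DEMO_TABLE))    # xyd
print(remove_lower("FOObarFOOObBAR"))  # FOOFOOOBAR
```

Building the table once at import time keeps the per-call work down to the translate call itself.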
Old Python 2.x answer for anyone who cares:
I'd use str.translate. Only the delete step is performed if you pass None for the translation table. In this case, I pass string.ascii_lowercase as the letters to be deleted.
>>> import string
>>> s = 'FOObarFOOObBAR'
>>> s.translate(None, string.ascii_lowercase)
'FOOFOOOBAR'
I doubt you'll find a faster way, but there's always timeit to compare different options if someone is motivated :).