How can I remove all strings that fit a certain format from a list?
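You can use the regular expression \d+(?::\d+)?$ and filter with it. Used with re.match, it matches strings that consist only of digits, optionally followed by a colon and more digits.
See the demo: https://regex101.com/r/HoGZYh/1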
import re
a = ['abd', ' the dog', '4:45', '1234 total', '123', '6:31']
print([i for i in a if not re.match(r"\d+(?::\d+)?$", i)])
Output: ['abd', ' the dog', '1234 total']
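For Python 3, the same filter could be sketched as below. The name NUMERIC_OR_TIME is an illustrative choice, not something from the answer above. The sketch also swaps in re.fullmatch for the trailing $, so a string such as "123\n" (which re.match with $ would accept) is not treated as a match.

```python
import re

# Hypothetical name; the pattern is the one from the answer, minus the trailing `$`
NUMERIC_OR_TIME = re.compile(r"\d+(?::\d+)?")

a = ['abd', ' the dog', '4:45', '1234 total', '123', '6:31']

# fullmatch only succeeds when the entire string matches the pattern
kept = [s for s in a if not NUMERIC_OR_TIME.fullmatch(s)]
print(kept)  # ['abd', ' the dog', '1234 total']
```

Compiling the pattern once is mostly a readability choice here; for a short list the difference in speed is unlikely to matter.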
Consider using the built-in filter function with a compiled regex.
>>> import re
>>> no_times = re.compile(r'^(?!\d\d?:\d\d(\s*[AP]M)?$).*$')
>>> a = ['abd', ' the dog', '4:45 AM', '1234 total', 'etc...','6:31 PM', '2:36']
>>> filter(no_times.match, a)
['abd', ' the dog', '1234 total', 'etc...']
A lambda can also be used for the first argument if, for example, you wanted to avoid compiling a regex, though it is messier.
>>> filter(lambda s: not re.match(r'^\d\d?:\d\d(\s*[AP]M)?$', s), a)
['abd', ' the dog', '1234 total', 'etc...']
Note that in Python 3, filter returns an iterator instead of a list.
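In Python 3 the same call can be wrapped in list() to get a list back. A minimal sketch, reusing the no_times pattern and the sample list from the session above:

```python
import re

no_times = re.compile(r'^(?!\d\d?:\d\d(\s*[AP]M)?$).*$')
a = ['abd', ' the dog', '4:45 AM', '1234 total', 'etc...', '6:31 PM', '2:36']

# filter() is lazy in Python 3; list() consumes it into a list
result = list(filter(no_times.match, a))
print(result)  # ['abd', ' the dog', '1234 total', 'etc...']
```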
The regular expression here works by accepting all strings except those matching \d\d?:\d\d(\s*[AP]M)?$. In other words, it rejects strings of the form H:MM or HH:MM, optionally followed by some whitespace and AM or PM, and accepts everything else.
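To make the pieces of the pattern easier to see, the inner time pattern can be written with re.VERBOSE and negated in Python rather than with a lookahead. This is a sketch of an alternative formulation that gives the same result on the sample list, not the form used above; TIME and is_time are names introduced here for illustration.

```python
import re

# Illustrative names; the pattern is the inner part of the lookahead above
TIME = re.compile(r"""
    \d\d?          # one or two digits for the hour
    :              # a literal colon
    \d\d           # exactly two digits for the minutes
    (\s*[AP]M)?    # optional whitespace, then AM or PM
    $              # end of the string
""", re.VERBOSE)

def is_time(s):
    """Return True if s looks like H:MM or HH:MM with an optional AM/PM."""
    return TIME.match(s) is not None

a = ['abd', ' the dog', '4:45 AM', '1234 total', 'etc...', '6:31 PM', '2:36']
kept = [s for s in a if not is_time(s)]
print(kept)  # ['abd', ' the dog', '1234 total', 'etc...']
```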
Try this code in pure Python. First, it checks the last two characters: if they are 'am' or 'pm' (case-insensitive), the element is removed from the list. Second, it checks whether the element contains ':'. If it does, the code looks at the text before the first ':' and the text after the last ':'; if both consist only of digits, the element is removed from the list. The idea supports number|number:number and number:number|number.
def removeElements(a):
    removed_elements = []
    for element in a:
        # Mark anything ending in AM/PM (case-insensitive)
        if element[-2:].lower() in ('am', 'pm'):
            removed_elements.append(element)
        if ':' in element:
            parts = element.split(':')
            # Mark it if the text before the first ':' and after the last ':' is all digits
            if parts[-1].isdigit() and parts[0].isdigit():
                removed_elements.append(element)
    # Keep every element that was not marked for removal
    output = []
    for element in a:
        if element not in removed_elements:
            output.append(element)
    return output
a = ['abd', ' the dog', '4:45 AM', '1234 total', 'etc...','6:31 PM', '2:36']
output = removeElements(a)
print(output)
Output for this example: ['abd', ' the dog', '1234 total', 'etc...']
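The same idea can also be packaged as a predicate plus a list comprehension, which avoids the second pass over removed_elements. The sketch below is a variation, not the code above: the hypothetical helper looks_like_time strips a trailing AM/PM first and then requires digits on both sides of a single colon, so a word such as 'program' that merely ends in 'am' is kept.

```python
def looks_like_time(s):
    """Hypothetical helper: True for strings such as '2:36' or '4:45 AM'."""
    text = s.strip()
    # Drop an optional trailing AM/PM marker (case-insensitive)
    if text[-2:].lower() in ('am', 'pm'):
        text = text[:-2].rstrip()
    before, sep, after = text.partition(':')
    # Require a colon with digits only on each side of it
    return sep == ':' and before.isdigit() and after.isdigit()

a = ['abd', ' the dog', '4:45 AM', '1234 total', 'etc...', '6:31 PM', '2:36', 'program']
output = [s for s in a if not looks_like_time(s)]
print(output)  # ['abd', ' the dog', '1234 total', 'etc...', 'program']
```

Whether the stricter check is wanted depends on the data; if entries like '12am' without a colon should also be dropped, the AM/PM test would need to stand on its own as in the original function.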