Match last occurrence with regex

For me the clearest way is:

>>> import re
>>> re.findall('<br>(.*?)<br>', text)[-1]
'Tizi Ouzou'
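
One caveat with the findall version: matches cannot overlap, so each closing <br> is consumed and cannot also open the next match. On '<br>a<br>b<br>' it returns only ['a']. It gives the right answer on the sample only because of how the tags happen to pair up. A possible variant, as a sketch rather than the original answer's method, is to peek at the closing <br> with a lookahead so it stays available (the text string here is a shortened stand-in for the sample):

```python
import re

# Shortened stand-in for the sample text used in this thread.
text = "egestas <br>semper<br>tizi ouzou<br>Tizi Ouzou<br>       "

# The closing <br> is only checked by the lookahead, not consumed,
# so it can also serve as the opening <br> of the next segment.
segments = re.findall(r'<br>(.*?)(?=<br>)', text)
last = segments[-1]

print(segments)
print(last)
```

Every segment between two consecutive <br> tags is collected this way, and [-1] picks the last one.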

A non-regex approach using the built-in str methods:

text = """
Pellentesque habitant morbi tristique senectus et netus et
malesuada fames ac turpis egestas. Vestibulum tortor quam, feugiat vitae
ultricies eget, tempor sit amet, ante. Donec eu libero sit amet quam
egestas <br>semper<br>tizi ouzou<br>Tizi Ouzou<br>       """

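# Split at most twice, starting from the right; the second-to-last piece
# is the text between the final two <br> tags.
res = text.rsplit('<br>', 2)[-2]
print(res)  # Tizi Ouzou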

Have a look at the related questions: you shouldn't parse HTML with regex. Use an HTML parser instead. For Python, I hear Beautiful Soup is the way to go.
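
To give an idea of what the parser route looks like, here is a minimal sketch. It uses the standard library's html.parser instead of Beautiful Soup, only so that it runs without installing anything. The class name BrSegments and the sample string are made up for illustration.

```python
from html.parser import HTMLParser


class BrSegments(HTMLParser):
    """Collect the text found between consecutive <br> tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.segments = []
        self._buffer = None  # stays None until the first <br> is seen

    def handle_starttag(self, tag, attrs):
        if tag == "br":
            # A <br> closes the segment in progress (if any) and opens a new one.
            if self._buffer is not None:
                self.segments.append("".join(self._buffer))
            self._buffer = []

    def handle_data(self, data):
        # Text before the first <br> is ignored.
        if self._buffer is not None:
            self._buffer.append(data)


parser = BrSegments()
parser.feed("egestas <br>semper<br>tizi ouzou<br>Tizi Ouzou<br>       ")
parser.close()

last = parser.segments[-1] if parser.segments else None
print(last)
```

Text after the final <br> is never followed by another <br>, so it is not stored as a segment. That matches what the question asks for.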

Anyway, if you want to do it with regex, you need to make sure that .* cannot go past another <br>. To do that, before consuming each character we can use a lookahead to make sure that it doesn't start another <br>:

<br>(?:(?!<br>).)*<br>\s*$
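
In Python this could be used roughly as follows. This is a sketch: the capturing parentheses around the inner group are my addition so the text can be pulled out with group(1), and the html string is a shortened stand-in for the sample.

```python
import re

# Shortened stand-in for the sample text used in this thread.
html = "egestas <br>semper<br>tizi ouzou<br>Tizi Ouzou<br>       "

# Same pattern as above, with an extra capturing group around the
# "any character that does not start a <br>" part.
last_br = re.compile(r'<br>((?:(?!<br>).)*)<br>\s*$')

match = last_br.search(html)
last = match.group(1) if match else None
print(last)
```

Note that . does not match a newline by default, so if the last segment can span several lines you would need to pass re.DOTALL when compiling.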
