Python regex match whole string only

You can use \Z:

\Z

Matches only at the end of the string.

In [5]: re.match(r'\w+\Z', 'foo\n')

In [6]: re.match(r'\w+\Z', 'foo')
Out[6]: <_sre.SRE_Match object; span=(0, 3), match='foo'>

To test whether you matched the entire string, just check if the matched string is as long as the entire string:

m = re.match(r".*", mystring)
start, stop = m.span()
if stop-start == len(mystring):
    print("The entire string matched")

Note: This is independent of the question (which you didn't ask) of how to match a trailing newline.


You can use a negative lookahead assertion to require that the $ is not followed by a trailing newline:

>>> re.match(r'\w+$(?!\n)', 'foo\n')
>>> re.match(r'\w+$(?!\n)', 'foo')
<_sre.SRE_Match object; span=(0, 3), match='foo'>

re.MULTILINE is not relevant here; OP has it turned off and the regex is still matching. The problem is that $ always matches right before the trailing newline:

When [re.MULTILINE is] specified, the pattern character '^' matches at the beginning of the string and at the beginning of each line (immediately following each newline); and the pattern character '$' matches at the end of the string and at the end of each line (immediately preceding each newline). By default, '^' matches only at the beginning of the string, and '$' only at the end of the string and immediately before the newline (if any) at the end of the string.

I have experimentally verified that this works correctly with re.X enabled.

Tags:

Python

Regex