Matching any character including newlines in a Python regex subexpression, not globally

To match a newline, or "any symbol", without re.S/re.DOTALL, you may use any of the following:

[\s\S]
[\w\W]
[\d\D]

The main idea is that a shorthand class and its opposite, placed together inside a character class, cover every possible character, so the class matches any symbol in the input string, newlines included.
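As a minimal sketch of how this keeps the "match newlines" behavior local to one subexpression, the snippet below lets only the bracketed part cross line breaks while a plain `.` elsewhere in the same pattern still stops at the end of its line. The sample text and variable names are made up for illustration.

```python
import re

# Hypothetical sample input, chosen only to illustrate the behavior.
text = "start\nmiddle line\nend tail\nnext line"

# Only the [\s\S]*? subexpression may cross newlines; the trailing .*
# still stops at the end of its line because re.DOTALL is not set.
pattern = re.compile(r"start([\s\S]*?)end(.*)")

m = pattern.search(text)
across = m.group(1)  # spans the line breaks between "start" and "end"
rest = m.group(2)    # limited to the remainder of the "end" line

# The three classes from the list above behave the same way here:
# each one, repeated, consumes the whole multi-line string.
results = [
    re.fullmatch(cls + "+", text) is not None
    for cls in (r"[\s\S]", r"[\w\W]", r"[\d\D]")
]

print(repr(across), repr(rest), results)
```

Here `across` contains the newlines, while `rest` does not run on into "next line".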

Compared with (.|\s) and other variations that use alternation, the character class solution is much more efficient because it involves much less backtracking when used with a * or + quantifier. In a small example, (?:.|\n)+ takes 45 steps to complete, while [\s\S]+ takes just 2 steps.
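Python's re module does not report step counts, so the following is only a rough sketch that uses wall-clock time from timeit as a proxy. It first checks that both patterns match the same text, then times each one. The sample string and repeat counts are arbitrary assumptions, and the absolute numbers will vary by machine and Python version.

```python
import re
import timeit

# Arbitrary multi-line sample; the size is an assumption for illustration.
sample = "line one\nline two\n" * 200

alternation = re.compile(r"(?:.|\n)+")
char_class = re.compile(r"[\s\S]+")

# Both patterns should consume exactly the same text.
same_match = alternation.match(sample).group() == char_class.match(sample).group()

# Time repeated matching; treat the result as a rough indication only.
t_alt = timeit.timeit(lambda: alternation.match(sample), number=200)
t_cls = timeit.timeit(lambda: char_class.match(sample), number=200)

print(f"same match: {same_match}")
print(f"(?:.|\\n)+ : {t_alt:.4f}s")
print(f"[\\s\\S]+   : {t_cls:.4f}s")
```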

Tags:

Python

Regex