Negating a backreference in Regular Expressions
Without knowing what you need the information for (or even which language or tool you are using this regex in), I can only suggest a few possible approaches.
Using these strings:
value = "hello and good morning"
value = 'hola y buenos dias'
value = 'how can I say "goodbye" so soon?'
value = 'why didn\'t you say "hello" to me this morning?'
value = "Goodbye! Please don't forget to write!"
value = 'Goodbye! Please don\'t forget to write!'
this expression:
"((\\"|[^"])*)"|'((\\'|[^'])*)'
will match these strings:
"hello and good morning"
'hola y buenos dias'
'how can I say "goodbye" so soon?'
'why didn\'t you say "hello" to me this morning?'
"Goodbye! Please don't forget to write!"
'Goodbye! Please don\'t forget to write!'
It allows either the "other" type of quote, or the same type of quote when escaped with a single preceding \. The contents of the quoted string end up in either group 1 or group 3. You can tell which type of quote was used by looking at the first (or last) character of the match.
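As a rough illustration of the group 1 / group 3 point, here is a minimal Python sketch. Python is an assumption, since the question does not name a language, and the names DOUBLE, SINGLE, QUOTED and quoted_contents are made up for this example.

```python
import re

# The two halves of the expression above, kept separate only so that
# each one fits in a simple raw string literal.
DOUBLE = r'"((\\"|[^"])*)"'   # groups 1 and 2
SINGLE = r"'((\\'|[^'])*)'"   # groups 3 and 4
QUOTED = re.compile(DOUBLE + "|" + SINGLE)


def quoted_contents(text):
    """Return the contents of the first quoted string in text, or None."""
    m = QUOTED.search(text)
    if m is None:
        return None
    # Exactly one alternative took part in the match, so only one of
    # group 1 (double quotes) and group 3 (single quotes) is set.
    return m.group(1) if m.group(1) is not None else m.group(3)


if __name__ == "__main__":
    for line in [
        'value = "hello and good morning"',
        r"""value = 'why didn\'t you say "hello" to me this morning?'""",
    ]:
        print(quoted_contents(line))
```

Note that the escaping backslash is left in the returned text; removing it would be a separate step.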
If you need some of these things to be in particular match groups, please give more specific examples (and include things that should not work, but look like they might be close).
Please ask if you would like to take this route and need a little more help.
You can use:
\bvalue\s*=\s*(['"])(.*?)\1
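For what it's worth, here is a small sketch of how that pattern behaves in Python (an assumed language; ASSIGN and parse_value are hypothetical names). The lazy .*? stops at the first occurrence of whatever group 1 captured, so the other type of quote is allowed inside, but a backslash-escaped quote of the same type is not treated specially.

```python
import re

# Group 1 captures the opening quote, group 2 the contents, and \1
# requires the closing quote to be the same character as the opening one.
ASSIGN = re.compile(r"""\bvalue\s*=\s*(['"])(.*?)\1""")


def parse_value(text):
    """Return (quote_char, contents) for the first match, or None."""
    m = ASSIGN.search(text)
    if m is None:
        return None
    return m.group(1), m.group(2)


if __name__ == "__main__":
    # The other type of quote inside the string is fine.
    print(parse_value('''value = "Goodbye! Please don't forget to write!"'''))
    # An escaped quote of the same type ends the match early.
    print(parse_value(r"value = 'Goodbye! Please don\'t forget to write!'"))
```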
Instead of a negated character class, you have to use a negative lookahead:
\bvalue\s*=\s*(["'])(?:(?!\1).)*\1
(?:(?!\1).)* consumes one character at a time, after the lookahead has confirmed that the character is not whatever was matched by the capturing group, (["']). A character class, negated or not, can only match one character at a time. As far as the regex engine knows, \1 could represent any number of characters, and there's no way to convince it that \1 will only contain " or ' in this case. So you have to go with the more general (and less readable) solution.
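To make the difference concrete, here is a hedged Python sketch comparing the lookahead form with a naive [^\1] attempt. Python is an assumption here, and TEMPERED, NAIVE and SHORT are illustrative names. TEMPERED also wraps the repeated part in an extra capturing group (not in the pattern above) so the contents can be read back. In Python's re module, \1 inside a character class is read as the character with octal code 1 rather than as a backreference, so the naive class excludes almost nothing.

```python
import re

# Lookahead form from above, with an added group 2 around the contents.
TEMPERED = re.compile(r"""\bvalue\s*=\s*(["'])((?:(?!\1).)*)\1""")

# Naive attempt: in Python, [^\1] means "any character except \x01",
# so it does not exclude the quote captured by group 1.
NAIVE = re.compile(r"""(["'])[^\1]*\1""")

# The same lookahead idea without the "value =" prefix, for comparison.
SHORT = re.compile(r"""(["'])(?:(?!\1).)*\1""")

if __name__ == "__main__":
    m = TEMPERED.search('''value = 'how can I say "goodbye" so soon?' ''')
    print(m.group(1), m.group(2))

    text = '"a" and "b"'
    print(NAIVE.search(text).group(0))  # runs on to the last quote
    print(SHORT.search(text).group(0))  # stops at the first closing quote
```

Other regex flavors may handle \1 inside a character class differently (some reject it outright), so the NAIVE result is specific to Python's re.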