Use regex to find a specific string not inside an HTML tag

When your regex processor doesn't support variable-length lookbehind, try this:

(<.+?>[^<>]*?)(_mystring_)([^<>]*?<.+?>)

Preserve capture groups 1 and 3 and replace only capture group 2.

For example, in Eclipse, find:

(<.+?>[^<>]*?)(_mystring_)([^<>]*?<.+?>)

and replace with:

$1_newString_$3

(Other regex processors might use a different capture group syntax in the replacement, such as \1.)
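For anyone not on Eclipse, here is a rough sketch of the same find/replace in Python, where the replacement is written with \g<1> and \g<3> instead of $1 and $3. The sample HTML string is made up for illustration. Note that each match consumes the tags on both sides, so this sketch only shows a single occurrence in a text run.

```python
import re

# Pattern from the answer: group 1 = preceding tag plus leading text,
# group 2 = the target string, group 3 = trailing text plus the next tag.
PATTERN = re.compile(r"(<.+?>[^<>]*?)(_mystring_)([^<>]*?<.+?>)")

# Made-up sample: the target appears once in an attribute and once in text.
html = '<p class="_mystring_">see _mystring_ here</p>'

# Keep groups 1 and 3, swap out group 2.
result = PATTERN.sub(r"\g<1>_newString_\g<3>", html)
print(result)  # <p class="_mystring_">see _newString_ here</p>
```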


Another regex that worked for me:

(?![^<]*>)_mystring_

Source: https://stackoverflow.com/a/857819/1106878
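A small Python sketch of that lookahead in use, with a made-up sample string. It assumes the text between tags never contains a bare > character, since the lookahead would mistake that for the end of a tag.

```python
import re

# Lookahead from the answer: the match is rejected if, scanning forward,
# a ">" is reached before any "<" (meaning we are inside a tag).
OUTSIDE_TAGS = re.compile(r"(?![^<]*>)_mystring_")

# Made-up sample: one occurrence in an attribute, three in plain text.
html = '<a title="_mystring_">_mystring_ and _mystring_</a> _mystring_'

result = OUTSIDE_TAGS.sub("_newString_", html)
print(result)
# <a title="_mystring_">_newString_ and _newString_</a> _newString_
```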


A quick and dirty alternative is to use a regex replace function with a callback to encode the contents of tags (everything between < and >), for example with base64, then run your search, then run a second callback to decode the tag contents.
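Something like this, as a minimal Python sketch of the encode / search / decode round trip. The function names are hypothetical. It assumes the search pattern cannot match base64 text (here the underscore is not in the standard base64 alphabet) and that the replacement does not introduce < or >.

```python
import base64
import re

# Anything between "<" and ">" counts as tag content.
TAG = re.compile(r"<([^<>]*)>")

def encode_tag(match):
    # Swap the tag content for its base64 form, keeping the angle brackets.
    encoded = base64.b64encode(match.group(1).encode("utf-8")).decode("ascii")
    return "<" + encoded + ">"

def decode_tag(match):
    # Reverse of encode_tag.
    return "<" + base64.b64decode(match.group(1)).decode("utf-8") + ">"

def replace_outside_tags(html, pattern, replacement):
    hidden = TAG.sub(encode_tag, html)               # 1. encode tag contents
    replaced = re.sub(pattern, replacement, hidden)  # 2. run the search
    return TAG.sub(decode_tag, replaced)             # 3. decode tag contents

print(replace_outside_tags('<p class="_mystring_">_mystring_</p>',
                           r"_mystring_", "_newString_"))
# <p class="_mystring_">_newString_</p>
```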

This can also save a lot of head-scratching when you need to exclude specific tags from a regex search: first obfuscate them and wrap them in a marker that won't match your search, then run your search, then deobfuscate whatever is inside the markers.
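A rough Python sketch of the marker idea. Everything specific here is an assumption for illustration: protecting whole <code>...</code> elements, using NUL characters as the marker, and the function names. It only works if the search pattern can never match the marker or the base64 payload.

```python
import base64
import re

# Hypothetical choice: protect whole <code>...</code> elements.
PROTECTED = re.compile(r"<code\b[^>]*>.*?</code>", re.DOTALL | re.IGNORECASE)
# Hypothetical marker: a NUL character on each side of the base64 payload.
MARKED = re.compile(r"\x00([A-Za-z0-9+/=]*)\x00")

def hide(match):
    # Obfuscate the whole element and wrap it in the marker.
    payload = base64.b64encode(match.group(0).encode("utf-8")).decode("ascii")
    return "\x00" + payload + "\x00"

def reveal(match):
    # Restore the original element from the marked payload.
    return base64.b64decode(match.group(1)).decode("utf-8")

def sub_skipping_protected(pattern, replacement, html):
    hidden = PROTECTED.sub(hide, html)
    replaced = re.sub(pattern, replacement, hidden)
    return MARKED.sub(reveal, replaced)

html = "<p>_mystring_ <code>_mystring_</code></p>"
print(sub_skipping_protected(r"_mystring_", "_newString_", html))
# <p>_newString_ <code>_mystring_</code></p>
```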


This should do it:

(?<!<[^>]*)_mystring_

It uses a negative lookbehind to check that the matched string does not have a < before it without a corresponding >. This needs a regex engine that supports variable-length lookbehind, such as .NET.
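If your engine rejects that pattern (Python's built-in re module only accepts fixed-width lookbehind, for instance), the same check can be done by hand in a callback. This is only a sketch of the idea, not the regex above: the function name is hypothetical, and it treats a position as "inside a tag" when the nearest < to the left has no > after it.

```python
import re

def replace_outside_tags(html, search, replacement):
    def maybe_replace(match):
        before = html[:match.start()]
        # Inside a tag when the last "<" comes after the last ">".
        if before.rfind("<") > before.rfind(">"):
            return match.group(0)  # leave the text inside the tag untouched
        return replacement
    # re.escape treats the search text as a literal string.
    return re.sub(re.escape(search), maybe_replace, html)

print(replace_outside_tags('<a title="_mystring_">_mystring_</a>',
                           "_mystring_", "_newString_"))
# <a title="_mystring_">_newString_</a>
```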

Tags: Html, .Net, Regex