Regular expression for anchor tag with all attributes
/<a[^>]*>([^<]+)<\/a>/g
It's far from perfect, but you need to provide more examples of what counts as a correct match and what doesn't (e.g. how should whitespace be handled?)
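To make the limits of this pattern concrete, here is a minimal JavaScript sketch. The helper name anchorTexts and the sample strings are mine, not from the question. It shows that the pattern picks up anchors whose content is plain text, but skips anchors that contain another tag, because [^<]+ stops at the first <.

```javascript
// The pattern from this answer, unchanged.
const simpleAnchor = /<a[^>]*>([^<]+)<\/a>/g;

// Hypothetical helper: return the captured link text of every match.
function anchorTexts(html) {
  return Array.from(html.matchAll(simpleAnchor), (m) => m[1]);
}

// Plain text between the tags: matched.
console.log(anchorTexts('<a href="a.html">Plain text</a>')); // [ 'Plain text' ]

// A nested tag between the tags: not matched, since [^<]+ cannot cross "<".
console.log(anchorTexts('<a href="b.html">Has <strong>bold</strong></a>')); // []
```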
/<a[\s]+([^>]+)>((?:.(?!\<\/a\>))*.)<\/a>/g
This one will match any <a ...>...</a> tag that has at least one attribute, including ones whose content contains a < or complete nested tags, such as:
blah blah <a href="test.html">This line contains an HTML opening < bracket.</a> blah blah
blah blah <a href="test.html">This line contains <strong>bold</strong> text.</a> blah blah
Would capture:
<a href="test.html">This line contains an HTML opening < bracket.</a>
- with capture groups:
href="test.html"
This line contains an HTML opening < bracket.
and
<a href="test.html">This line contains <strong>bold</strong> text.</a>
- with capture groups:
href="test.html"
This line contains <strong>bold</strong> text.
It also includes capturing groups for the tag attributes (like class="", href="", etc.) and the content (what is between the tags); these can be removed if you do not need them.
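As a rough illustration of those two capture groups, here is a small JavaScript sketch that runs the pattern over the two sample lines above. The function name findAnchors and the property names are just labels I picked for the example.

```javascript
// The pattern from this answer, unchanged.
const anchorWithGroups = /<a[\s]+([^>]+)>((?:.(?!\<\/a\>))*.)<\/a>/g;

// Hypothetical helper: collect the full tag plus both capture groups.
function findAnchors(html) {
  return Array.from(html.matchAll(anchorWithGroups), (m) => ({
    tag: m[0],        // the whole <a ...>...</a> match
    attributes: m[1], // group 1: everything between "<a " and ">"
    content: m[2],    // group 2: everything between the tags
  }));
}

const sample = [
  'blah blah <a href="test.html">This line contains an HTML opening < bracket.</a> blah blah',
  'blah blah <a href="test.html">This line contains <strong>bold</strong> text.</a> blah blah',
].join('\n');

for (const anchor of findAnchors(sample)) {
  console.log(anchor.attributes, '|', anchor.content);
}
// href="test.html" | This line contains an HTML opening < bracket.
// href="test.html" | This line contains <strong>bold</strong> text.
```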
If you want to capture across multiple lines, add an "s" flag (which lets "." match newlines) before or after the "g" flag at the end. Note that the "s" flag may not be supported in all flavors of regular expression.
Capture example (not using the "s" flag - not supported by regexr yet): http://regexr.com/39rsv
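For the multi-line case, here is a hedged JavaScript sketch comparing the pattern with and without the "s" flag. It assumes an engine that supports the flag (ES2018 or later). The last variant, which swaps "." for [\s\S], is a common workaround for engines without "s"; it is not part of the original answer. All variable names and the sample string are made up for the example.

```javascript
// Same pattern, without and with the "s" (dotAll) flag.
const withoutS = /<a[\s]+([^>]+)>((?:.(?!\<\/a\>))*.)<\/a>/g;
const withS = /<a[\s]+([^>]+)>((?:.(?!\<\/a\>))*.)<\/a>/gs;

// Workaround (my assumption, not from the answer): [\s\S] matches any
// character including newlines, so no "s" flag is needed.
const withoutFlagSupport = /<a[\s]+([^>]+)>((?:[\s\S](?!<\/a>))*[\s\S])<\/a>/g;

// Link text that spans two lines.
const multiLine = '<a href="test.html">first line\nsecond line</a>';

// Hypothetical helper: content (group 2) of every match.
const contents = (re, html) => Array.from(html.matchAll(re), (m) => m[2]);

console.log(contents(withoutS, multiLine));           // []
console.log(contents(withS, multiLine));              // [ 'first line\nsecond line' ]
console.log(contents(withoutFlagSupport, multiLine)); // [ 'first line\nsecond line' ]
```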
Just a small correction to the accepted answer. This is the correct regex: /<a[^>]*>([^<]+)<\/a>/g. The forward slash (/) in the closing anchor tag </a> was not escaped, so no match would be made.
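A quick JavaScript sketch of why the escape matters. In a /.../ literal the slash is the delimiter, so the one inside </a> has to be written as \/. When the pattern is passed as a string to the RegExp constructor there is no delimiter, so a plain / is fine. The variable names and sample string are only for illustration.

```javascript
// Regex literal: "/" delimits the pattern, so the slash in </a> is escaped.
const fromLiteral = /<a[^>]*>([^<]+)<\/a>/g;

// RegExp constructor: the pattern is a string, so "/" needs no escape.
const fromConstructor = new RegExp('<a[^>]*>([^<]+)</a>', 'g');

const snippet = '<a href="test.html">link text</a>';

// Both forms find the same match.
console.log(snippet.match(fromLiteral));     // [ '<a href="test.html">link text</a>' ]
console.log(snippet.match(fromConstructor)); // [ '<a href="test.html">link text</a>' ]
```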