Using regular expressions to find img tags without an alt attribute
Here is what I just tried in my own environment, on a massive enterprise code base, with good success (it produced no false positives and definitely found valid cases):
<img(?![^>]*\balt=)[^>]*?>
What's going on in this search:
- find the opening of the tag
- assert, with a negative look-ahead, that what follows is not: zero or more characters that are not the closing bracket, followed by …
- … a word that begins with "alt" ("\b" is there to make sure we don't get a mid-word match on something like a class value) and is followed by "="; then …
- look for zero or more characters that are not the closing bracket
- find the closing bracket
So this will match:
<img src="foo.jpg" class="baltic" />
But it won't match either of these:
<img src="foo.jpg" class="baltic" alt="" />
<img src="foo.jpg" alt="I have a value.">
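As a quick check, the three examples above can be run through the pattern outside an editor. This is a minimal sketch that assumes Python's re module purely for illustration; the names ALT_LESS_IMG and find_alt_less are made up for the example and are not part of any tool.

```python
import re

# Same pattern as above; the names used here are illustrative only.
ALT_LESS_IMG = re.compile(r'<img(?![^>]*\balt=)[^>]*?>')

def find_alt_less(html):
    """Return every img tag in `html` that has no alt= attribute."""
    return ALT_LESS_IMG.findall(html)

samples = [
    '<img src="foo.jpg" class="baltic" />',         # no alt attribute: reported
    '<img src="foo.jpg" class="baltic" alt="" />',  # empty alt: skipped
    '<img src="foo.jpg" alt="I have a value.">',    # alt with a value: skipped
]

for tag in samples:
    print(bool(find_alt_less(tag)), tag)
```

Only the first sample is reported, which agrees with the lists above.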
This works in Eclipse:
<img(?!.*alt).*?>
I'm updating for Section 508 too!
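One difference from the first pattern is worth seeing side by side: this shorter one rejects a tag whenever the letters "alt" appear anywhere after "<img" on the same line, not only as an attribute name. The sketch below assumes Python's re module rather than Eclipse's own search (the look-ahead and "." should behave the same way for these patterns), and the constant names are hypothetical.

```python
import re

# Pattern from this answer, and the stricter pattern from the first answer.
ECLIPSE_PATTERN = re.compile(r'<img(?!.*alt).*?>')
STRICTER_PATTERN = re.compile(r'<img(?![^>]*\balt=)[^>]*?>')

tag = '<img src="foo.jpg" class="baltic" />'

# "alt" inside "baltic" is enough to make the look-ahead reject the tag.
print(bool(ECLIPSE_PATTERN.search(tag)))

# The stricter pattern still reports it, because there is no alt= attribute.
print(bool(STRICTER_PATTERN.search(tag)))
```

So the shorter pattern can miss alt-less tags (false negatives), which may be acceptable for a first pass through a project.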
Building on Mr.Black's and Roberts126's answers:
/(<img(?!.*?alt=(['"]).*?\2)[^>]*)(>)/
This will match an img tag anywhere in the code which either has no alt attribute or has an alt attribute that is not followed by a quoted value, ="..." or ='...' (i.e. an invalid alt attribute).
Breaking it down:
( : open capturing group
<img : match the opening of an img tag
(?! : open negative look-ahead
.*? : lazily match zero or more of any character
alt=(['"]) : match an 'alt' attribute followed by ' or " (and remember which for later)
.*? : lazily match zero or more characters, i.e. the value of the 'alt' attribute
\2) : back-reference to the ' or " matched earlier, then close the negative look-ahead
[^>]* : match the rest of the img tag up to (but not including) its closing '>'
) : close capturing group
(>) : match the closing '>' of the img tag
If your code editor allows search and replace by regex, you can use this in combination with the replace string:
$1 alt=""$3
This finds any alt-less img tags and appends an empty alt attribute to them. This is useful when using spacers or other layout images for HTML emails and the like.
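The same search and replace can be tried in a script before running it across a project. This is a rough sketch assuming Python's re module, where the editor's $1 and $3 are written \1 and \3; the names MISSING_ALT and add_empty_alt are hypothetical.

```python
import re

# The pattern from above, without the /.../ delimiters.
MISSING_ALT = re.compile(r'''(<img(?!.*?alt=(['"]).*?\2)[^>]*)(>)''')

def add_empty_alt(html):
    """Append alt="" to every img tag that the pattern matches."""
    # \1 is the tag up to its closing bracket, \3 is the bracket itself.
    return MISSING_ALT.sub(r'\1 alt=""\3', html)

print(add_empty_alt('<img src="spacer.gif">'))           # gains alt=""
print(add_empty_alt('<img src="logo.png" alt="Logo">'))  # left unchanged
```

Two behaviours are worth checking on your own markup first. On a self-closing tag the attribute lands after the slash, giving <img src="x.gif" / alt="">. And because the look-ahead uses "." rather than "[^>]", an alt attribute in a later tag on the same line keeps an earlier alt-less tag from matching.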