PHP/regex: How to get the string value of HTML tag?

Since attribute values may contain a plain > character, try this regular expression:

$pattern = '/<'.preg_quote($tagname, '/').'(?:[^"'>]*|"[^"]*"|\'[^\']*\')*>(.*?)<\/'.preg_quote($tagname, '/').'>/s';

But regular expressions are not suitable for parsing non-regular languages like HTML. You should better use a parser like SimpleXML or DOMDocument.


Try this

$str = '<option value="123">abc</option>
        <option value="123">aabbcc</option>';

preg_match_all("#<option.*?>([^<]+)</option>#", $str, $foo);

print_r($foo[1]);

<?php
function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname ?.*>(.*)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

$str = '<textformat leading="2"><p align="left"><font size="10">get me</font></p></textformat>';
$txt = getTextBetweenTags($str, "font");
echo $txt;
?>

That should do the trick


In your pattern, you simply want to match all text between the two tags. Thus, you could use for example a [\w\W] to match all characters.

function getTextBetweenTags($string, $tagname) {
    $pattern = "/<$tagname>([\w\W]*?)<\/$tagname>/";
    preg_match($pattern, $string, $matches);
    return $matches[1];
}

Tags:

Html

Php

Regex