Check if a string is HTML or not
A better regex to use to check if a string is HTML is:
/^/
For example:
/^/.test('') // true
/^/.test('foo bar baz') // true
/^/.test('<p>fizz buzz</p>') // true
In fact, it's so good that it returns true for every string passed to it, because every string is HTML. Seriously: even if it's poorly formatted or invalid, it's still HTML.
If what you're looking for is the presence of HTML elements, rather than simply any text content, you could use something along the lines of:
/<\/?[a-z][\s\S]*>/i.test(str)
It won't help you parse the HTML in any way, but it will certainly flag the string as containing HTML elements.
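As a rough sketch of that idea, the regex can be wrapped in a small helper. The name hasHtmlElements is made up for illustration, and the check is only a heuristic: it reports anything shaped like a tag, not well-formed HTML.

```javascript
// Hypothetical helper (the name is illustrative, not from any library).
// Heuristic: looks for "<" or "</" followed by a letter, with a ">" somewhere later.
function hasHtmlElements(str) {
  return /<\/?[a-z][\s\S]*>/i.test(str);
}

console.log(hasHtmlElements('foo bar baz'));      // false
console.log(hasHtmlElements('<p>fizz buzz</p>')); // true
console.log(hasHtmlElements('1 < 2'));            // false: "<" is not followed by a letter
console.log(hasHtmlElements('a <b and c> d'));    // true: "<b and c>" is shaped like a tag
```

The last two calls show where the heuristic draws its line: a bare comparison is ignored, but any "<" directly followed by a letter and a later ">" is flagged.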
Method #1. Here is a simple function to test whether a string contains HTML:
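function isHTML(str) {
  // Let the browser parse the string inside a detached element.
  var a = document.createElement('div');
  a.innerHTML = str;
  // Walk the parsed child nodes backwards, looking for an element.
  for (var c = a.childNodes, i = c.length; i--; ) {
    if (c[i].nodeType === 1) return true; // 1 === Node.ELEMENT_NODE
  }
  return false;
}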
The idea is to let the browser's DOM parser decide whether the provided string looks like HTML. As you can see, it simply checks for an ELEMENT_NODE (nodeType of 1) among the parsed child nodes.
I ran a couple of tests, and it looks like it works:
isHTML('<a>this is a string</a>') // true
isHTML('this is a string') // false
isHTML('this is a <b>string</b>') // true
This solution properly detects an HTML string; however, it has the side effect that img, video, and similar tags start downloading their resources as soon as they are parsed via innerHTML.
Method #2. Another method uses DOMParser and doesn't have the resource-loading side effect:
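function isHTML(str) {
  // Parse into a separate document, so nothing is inserted into the page.
  var doc = new DOMParser().parseFromString(str, "text/html");
  // True if any top-level node in the body is an element (nodeType 1).
  return Array.from(doc.body.childNodes).some(node => node.nodeType === 1);
}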
Notes:
1. Array.from is an ES2015 method; it can be replaced with [].slice.call(doc.body.childNodes).
2. The arrow function in the some call can be replaced with a regular anonymous function.
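Applying both notes gives a version of Method #2 without ES2015 features. This is only a sketch: the name isHTMLLegacy is made up here, and it still assumes a browser where DOMParser supports the "text/html" type.

```javascript
// Hypothetical ES5-style variant of Method #2 (the name is illustrative).
// Assumes a browser environment where DOMParser can parse "text/html".
function isHTMLLegacy(str) {
  var doc = new DOMParser().parseFromString(str, 'text/html');
  // [].slice.call copies the NodeList into a real array (stands in for Array.from).
  var nodes = [].slice.call(doc.body.childNodes);
  // A plain anonymous function stands in for the arrow function.
  return nodes.some(function (node) {
    return node.nodeType === 1; // 1 === Node.ELEMENT_NODE
  });
}
```

In a browser, isHTMLLegacy('this is a <b>string</b>') should return true and isHTMLLegacy('this is a string') should return false, the same as the ES2015 version.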