Check if a string is HTML or not

A better regex to use to check if a string is HTML is:

/^/

For example:

/^/.test('') // true
/^/.test('foo bar baz') // true
/^/.test('<p>fizz buzz</p>') // true

In fact, it's so good that it returns true for every string passed to it, because every string is HTML. Seriously: even if it's poorly formatted or invalid, it's still HTML.

If what you're looking for is the presence of HTML elements, rather than simply any text content, you could use something along the lines of:

/<\/?[a-z][\s\S]*>/i.test(str) // str is the string to check

It won't help you parse the HTML in any way, but it will certainly flag the string as containing HTML elements.
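
To see what that pattern does and does not catch, here is a minimal sketch that wraps it in a function. The name containsHtmlElement is hypothetical and the sample strings are made up for illustration; treat it as a rough check, not as validation.

```javascript
// Hypothetical helper wrapping the regex above; it does not parse anything.
function containsHtmlElement(str) {
  // Matches "<", an optional "/", a letter, then anything up to a later ">".
  return /<\/?[a-z][\s\S]*>/i.test(str);
}

console.log(containsHtmlElement('<p>fizz buzz</p>'));  // true
console.log(containsHtmlElement('foo bar baz'));       // false
console.log(containsHtmlElement('1 < 2 and 3 > 2'));   // false: "<" is not followed by a letter
console.log(containsHtmlElement('if a <b and c> d'));  // true: "<b and c>" is flagged as a tag
```

The last case shows the trade-off: anything shaped like a tag is flagged, whether or not the author meant it as markup.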

Method #1. Here is a simple function to test whether a string contains HTML:

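function isHTML(str) {
  // Let the browser parse the string inside a detached element.
  var a = document.createElement('div');
  a.innerHTML = str;

  // Walk the parsed child nodes backwards, looking for any element.
  for (var c = a.childNodes, i = c.length; i--; ) {
    if (c[i].nodeType === 1) return true; // 1 is ELEMENT_NODE
  }

  return false;
}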

The idea is to let the browser's DOM parser decide whether the provided string looks like HTML or not. As you can see, it simply checks for an ELEMENT_NODE (a nodeType of 1) among the parsed child nodes.

I ran a couple of tests, and it looks like it works:

isHTML('<a>this is a string</a>') // true
isHTML('this is a string')        // false
isHTML('this is a <b>string</b>') // true

This solution properly detects HTML strings; however, it has the side effect that img, video, and similar tags start downloading their resources as soon as the string is parsed via innerHTML.

Method #2. Another method uses DOMParser and doesn't have the resource-loading side effect:

function isHTML(str) {
  var doc = new DOMParser().parseFromString(str, "text/html");
  return Array.from(doc.body.childNodes).some(node => node.nodeType === 1);
}

Notes:
1. Array.from is an ES2015 method; it can be replaced with [].slice.call(doc.body.childNodes).
2. The arrow function in the some call can be replaced with a regular anonymous function.
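
Applying both notes, an ES5-compatible version of Method #2 might look like the sketch below. The helper name hasElementNode is hypothetical, introduced here only so the node check can be tried on plain array-like objects outside a browser; isHTML itself still needs a browser that provides DOMParser.

```javascript
// Hypothetical helper: true if any item in an array-like has nodeType 1.
function hasElementNode(nodes) {
  // [].slice.call copies an array-like (such as a NodeList) into a real array.
  return [].slice.call(nodes).some(function (node) {
    return node.nodeType === 1; // 1 is ELEMENT_NODE
  });
}

// ES5 variant of Method #2; only runs where DOMParser exists (browsers).
function isHTML(str) {
  var doc = new DOMParser().parseFromString(str, 'text/html');
  return hasElementNode(doc.body.childNodes);
}
```

Splitting out the helper is a design choice for this sketch only; inlining the slice and some calls inside isHTML behaves the same way.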
