When to use NodeIterator
It's slow for a variety of reasons. The most obvious is that almost nobody uses it, so far less time has been spent optimizing it. The other problem is that it's massively re-entrant: for every node, the engine has to call back into JS and run the filter function.
If you look at revision three of the benchmark, you'll find I've added a reimplementation of what the iterator is doing using getElementsByTagName("*") and then running an identical filter on that. As the results show, it's massively quicker. Going JS -> C++ -> JS is slow.
Filtering the nodes entirely in JS (the getElementsByTagName case) or C++ (the querySelectorAll case) is far quicker than doing it by repeatedly crossing the boundary.
Note also that selector matching, as used by querySelectorAll, is comparatively smart: it does right-to-left matching and is based on pre-computed caches (most browsers will iterate over a cached list of all elements with the class "klass", check whether each one is an a element, and then check whether its parent is a div), so it doesn't even need to iterate over the entire document.
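As a rough sketch of the "filter entirely in JS" approach described above: the selector `div > a.klass` is an assumption inferred from the description (the benchmark itself isn't reproduced here), and the helper names are hypothetical.

```javascript
// Hypothetical predicate mirroring the assumed selector `div > a.klass`:
// true for <a class="klass"> elements whose parent is a <div>.
function isKlassLinkInDiv(el) {
  return el.tagName === 'A' &&
    el.classList.contains('klass') &&
    el.parentNode !== null &&
    el.parentNode.tagName === 'DIV';
}

// Fetch every element with a single DOM call, then run the predicate in JS.
function filterInJs(root) {
  return Array.prototype.filter.call(root.getElementsByTagName('*'), isKlassLinkInDiv);
}

// Let the selector engine do all of the matching natively.
function filterNatively(root) {
  return root.querySelectorAll('div > a.klass');
}
```

In a browser, `filterInJs(document)` and `filterNatively(document)` should select the same elements; only where the matching work happens differs.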
Given that, when should you use NodeIterator? Basically never, in JavaScript at least. In languages such as Java (undoubtedly the primary reason why there's an interface called NodeIterator), it will likely be just as quick as anything else, as your filter will then be in the same language as the iterator itself. Apart from that, the only other time it makes sense is in languages where the memory usage of creating a Node object is far greater than that of the internal representation of the Node.
NodeIterator (and TreeWalker, for that matter) are almost never used, for a variety of reasons. This means that information on the topic is scarce, and answers like @gsnedders' come to be, which completely miss the mark. I know this question is almost a decade old, so excuse my necromancy.
1 Initiation & Performance
It is true that the initiation of a NodeIterator is way slower than a method like querySelectorAll, but that is not the performance you should be measuring.
The thing about NodeIterators is that they are live-ish in the way that, just like an HTMLCollection or live NodeList, you can keep using the object after initiating it once.
The NodeList returned by querySelectorAll is static and will have to be re-initiated every time you need to match newly added elements.
This version of the jsPerf puts the NodeIterator in the preparation code. The actual test only tries to loop over all newly added elements with iter.nextNode(). You can see that the iterator is now orders of magnitude faster.
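To make the "initiate once, keep using it" idea concrete, here is a minimal sketch. The `drain` helper and the filter for `a` elements are illustrative assumptions, not the code from the jsPerf.

```javascript
// Hypothetical helper: pull every node the iterator can currently reach.
// `iter` is anything with a nextNode() method that returns null when exhausted.
function drain(iter) {
  const found = [];
  let node;
  while ((node = iter.nextNode()) !== null) {
    found.push(node);
  }
  return found;
}

// Browser-only usage (skipped where there is no DOM, e.g. in Node.js).
if (typeof document !== 'undefined') {
  // Created once, up front: this is the expensive part.
  const iter = document.createNodeIterator(
    document.body,
    NodeFilter.SHOW_ELEMENT,
    (node) => node.tagName === 'A' ? NodeFilter.FILTER_ACCEPT : NodeFilter.FILTER_SKIP
  );

  drain(iter); // consume the links that are already on the page

  document.body.appendChild(document.createElement('a'));

  // The same iterator continues from where it stopped and reaches the new link.
  console.log(drain(iter));
}
```

The point of the design is that the second `drain(iter)` call only walks the nodes after the iterator's last position, whereas a querySelectorAll-based version would have to re-run the whole query.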
2 Selector performance
Okay, cool. Caching the iterator is faster. This version, however, shows another significant difference. I've added 10 classes (done[0-9]) that the selectors shouldn't be matching. The iterator loses about 10% of its speed, while the querySelectors lose 20%.
On the other hand, this version shows what happens when you add another div > at the start of the selector. The iterator loses 33% of its speed, while the querySelectors got a speed INCREASE of 10%.
Removing the initial div > at the start of the selector, like in this version, shows that both methods become slower, because they match more elements than in earlier versions. As expected, the iterator is relatively more performant than the querySelectors in this case.
This means that filtering on the basis of a node's own properties (its classes, attributes, etc.) is probably faster in a NodeIterator, while having a lot of combinators (>, +, ~, etc.) in your selector probably means querySelectorAll is faster.
This is especially true for the descendant (space) combinator. Selecting elements with querySelectorAll('article a') is way easier than manually looping over all ancestors of every a element, looking for one that has a tagName of 'ARTICLE'.
P.S. in §3.2, I give an example of how the exact opposite can be true if you want the opposite of what the space combinator does (exclude a tags with an article ancestor).
3 Impossible selectors
3.1 Simple hierarchical relationships
Of course, manually filtering elements gives you practically unlimited control. This means that you can filter out elements that would normally be impossible to match with CSS selectors. For example, CSS combinators can only "look back": selecting divs that are preceded by another div is possible with div + div, but selecting divs that are followed by another div is impossible, at least without the newer :has() pseudo-class.
However, inside a NodeFilter, you can achieve this by checking node.nextElementSibling?.tagName === 'DIV' (the ?. guards against last children, whose nextElementSibling is null). The same goes for every selection CSS selectors can't make.
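A minimal sketch of such a filter follows. The function name and the `NF` fallback object are hypothetical; the fallback only exists so the snippet also loads outside a browser, and its numbers are the constant values defined for NodeFilter in the DOM standard.

```javascript
// Use the real NodeFilter in a browser; fall back to its numeric values elsewhere.
const NF = typeof NodeFilter !== 'undefined'
  ? NodeFilter
  : { FILTER_ACCEPT: 1, FILTER_REJECT: 2, FILTER_SKIP: 3 };

// Accept <div> elements whose next element sibling is also a <div>.
function divFollowedByDiv(node) {
  const next = node.nextElementSibling; // null for a last child
  return node.tagName === 'DIV' && next !== null && next.tagName === 'DIV'
    ? NF.FILTER_ACCEPT
    : NF.FILTER_SKIP;
}

// Browser-only usage (skipped where there is no DOM).
if (typeof document !== 'undefined') {
  const iter = document.createNodeIterator(
    document.body,
    NodeFilter.SHOW_ELEMENT,
    divFollowedByDiv
  );
  for (let node = iter.nextNode(); node !== null; node = iter.nextNode()) {
    console.log(node); // each <div> that is immediately followed by another <div>
  }
}
```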
3.2 More global hierarchical relationships
Another thing I personally love about the usage of NodeFilters is that, when passed to a TreeWalker, you can reject a node and its whole sub-tree by returning NodeFilter.FILTER_REJECT instead of NodeFilter.FILTER_SKIP.
Imagine you want to iterate over all a tags on the page, except for ones with an article ancestor.
With querySelectors, you'd type something like:
let a = document.querySelectorAll('a')
a = Array.prototype.filter.call(a, function (node) {
  // Walk up the ancestors; drop the link if any of them is an <article>
  while ((node = node.parentElement)) if (node.tagName === 'ARTICLE') return false
  return true
})
While in a NodeFilter, you'd only have to type this:
return node.tagName === 'ARTICLE' ? NodeFilter.FILTER_REJECT : // ✨ Magic happens here ✨
node.tagName === 'A' ? NodeFilter.FILTER_ACCEPT :
NodeFilter.FILTER_SKIP
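For completeness, here is one way the filter body above could be wired into a TreeWalker. This is a sketch: the function name and the `NF` fallback object are hypothetical (the fallback only lets the snippet load outside a browser, using the constant values from the DOM standard).

```javascript
// Use the real NodeFilter in a browser; fall back to its numeric values elsewhere.
const NF = typeof NodeFilter !== 'undefined'
  ? NodeFilter
  : { FILTER_ACCEPT: 1, FILTER_REJECT: 2, FILTER_SKIP: 3 };

// Same logic as the filter body above, wrapped in a named function.
function linksOutsideArticles(node) {
  return node.tagName === 'ARTICLE' ? NF.FILTER_REJECT  // drops the whole <article> subtree
       : node.tagName === 'A'       ? NF.FILTER_ACCEPT
       :                              NF.FILTER_SKIP;   // not a match, but keep descending
}

// Browser-only usage (skipped where there is no DOM).
if (typeof document !== 'undefined') {
  const walker = document.createTreeWalker(
    document.body,
    NodeFilter.SHOW_ELEMENT,
    linksOutsideArticles
  );
  const links = [];
  while (walker.nextNode()) {
    links.push(walker.currentNode);
  }
  console.log(links); // every <a> that has no <article> ancestor
}
```

Note that the subtree pruning only happens with a TreeWalker; a NodeIterator treats FILTER_REJECT the same as FILTER_SKIP.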
In conclusion
You don't initiate the API every time you need to iterate over nodes of the same kind. Sadly, the question was asked under that assumption, and the +500 answer (which gives it a lot more credit) doesn't even address the error or any of the perks NodeIterators have.
There are two main advantages NodeIterators have to offer:
- Live-ishness, as discussed in §1
- Advanced filtering, as discussed in §3 (I can't stress enough how useful the NodeFilter.FILTER_REJECT example is)
However, don't use NodeIterators when any of the following is true:
- Its instance is only going to be used once/a few times
- Complex hierarchical relationships are queried that are possible with CSS selectors (e.g. body.no-js article > div > div a[href^="/"])
Sorry for the long answer :)