XPath - Difference between node() and text()
`text()` and `node()` are node tests, in XPath terminology.
Node tests operate on a set (on an axis, to be exact) of nodes and return the ones that are of a certain type. When no axis is mentioned, the `child` axis is assumed by default.
There are all kinds of node tests:
- `node()` matches any node (the least specific node test of them all)
- `text()` matches text nodes only
- `comment()` matches comment nodes
- `*` matches any element node
- `foo` matches any element node named "foo"
- `processing-instruction()` matches PI nodes (they look like `<?name value?>`).
- Side note: The `*` also matches attribute nodes, but only along the `attribute` axis. `@*` is a shorthand for `attribute::*`. Attributes are not part of the `child` axis, that's why a normal `*` does not select them.
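To see the node tests side by side, here is a minimal sketch. It assumes the third-party lxml package is installed, and the sample document (with its `id` attribute, comment and PI) is made up purely for illustration:

```python
# Assumption: lxml is available (pip install lxml). The sample document is hypothetical.
from lxml import etree

# <root> has one attribute and five children:
# a comment, a PI, <foo>, the text "tail" and <bar/>.
SAMPLE = '<root id="r1"><!-- note --><?target data?><foo>hi</foo>tail<bar/></root>'
root = etree.fromstring(SAMPLE)

# Relative expressions: with no axis given, the child axis of <root> is used.
node_tests = [
    "node()",
    "text()",
    "comment()",
    "*",
    "foo",
    "processing-instruction()",
    "@*",            # shorthand for the attribute axis
    "attribute::*",  # the same thing, spelled out
]
counts = {test: len(root.xpath(test)) for test in node_tests}

for test, count in counts.items():
    print(f"{test:26} -> {count} node(s)")
```

Note that `*` finds only the two elements, while `@*` and `attribute::*` both find the single attribute.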
This XML document:
<produce>
 <item>apple</item>
 <item>banana</item>
 <item>pepper</item>
</produce>
represents the following DOM (simplified):
root node
  element node (name="produce")
    text node (value="\n ")
    element node (name="item")
      text node (value="apple")
    text node (value="\n ")
    element node (name="item")
      text node (value="banana")
    text node (value="\n ")
    element node (name="item")
      text node (value="pepper")
    text node (value="\n")
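If you want to check that tree yourself, a rough sketch with Python's standard `xml.dom.minidom` follows. This is plain DOM inspection rather than XPath, and the single-space indentation in the string is an assumption chosen to match the `"\n "` values shown above:

```python
from xml.dom import Node, minidom

# Assumed layout: one space of indentation before each <item>.
XML = (
    "<produce>\n"
    " <item>apple</item>\n"
    " <item>banana</item>\n"
    " <item>pepper</item>\n"
    "</produce>"
)

document = minidom.parseString(XML)   # the root (document) node
produce = document.documentElement    # the document element <produce>

children = produce.childNodes
text_children = [n for n in children if n.nodeType == Node.TEXT_NODE]
element_children = [n for n in children if n.nodeType == Node.ELEMENT_NODE]

# Print every child of <produce>, whitespace-only text nodes included.
for child in children:
    if child.nodeType == Node.TEXT_NODE:
        print("text node   ", repr(child.data))
    else:
        print("element node", child.tagName, repr(child.firstChild.data))
```

The root node (`document`) and the document element (`produce`) are two different objects here, which is the distinction that matters for `/` versus `/produce`.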
So with XPath:
- `/` selects the root node
- `/produce` selects a child element of the root node if it has the name "produce" (This is called the document element; it represents the document itself. Document element and root node are often confused, but they are not the same thing.)
- `/produce/node()` selects any type of child node beneath `/produce/` (i.e. all 7 children)
- `/produce/text()` selects the 4 (!) whitespace-only text nodes
- `/produce/item[1]` selects the first child element named "item"
- `/produce/item[1]/text()` selects all child text nodes (there's only one - "apple" - in this case)
And so on.
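The expressions above can be tried out with any XPath 1.0 engine. A minimal sketch, assuming the third-party lxml package and one space of indentation in the sample document (the `select` helper is just a hypothetical convenience wrapper):

```python
# Assumption: lxml is available (pip install lxml).
from lxml import etree

XML = (
    "<produce>\n"
    " <item>apple</item>\n"
    " <item>banana</item>\n"
    " <item>pepper</item>\n"
    "</produce>"
)
root = etree.fromstring(XML)


def select(expr):
    """Evaluate an XPath 1.0 expression against the sample document."""
    return root.xpath(expr)


all_children = select("/produce/node()")       # elements and text nodes alike
whitespace = select("/produce/text()")         # only the text nodes between items
first_item = select("/produce/item[1]")        # the first <item> element
first_text = select("/produce/item[1]/text()") # its single text node
item_texts = select("/produce/item/text()")    # text nodes of every <item>

print(len(all_children), "children of /produce")
print(len(whitespace), "text nodes directly under /produce:", whitespace)
print("first item text:", first_text)
print("all item texts:", item_texts)
```

lxml hands text nodes back as Python strings, so the whitespace-only nodes are easy to spot in the printed list.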
So, your questions
- "Select the text of all items under produce": `/produce/item/text()` (3 nodes selected)
- "Select all the manager nodes in all departments": `//department/manager` (1 node selected)
Notes
- The default axis in XPath is the `child` axis. You can change the axis by prefixing a different axis name. For example: `//item/ancestor::produce`
- Element nodes have text values. When you evaluate an element node as a string, its textual contents (the concatenated text of all its descendants) will be returned. In case of this example, `/produce/item[1]/text()` and `string(/produce/item[1])` will yield the same text.
- Also see this answer where I outline the individual parts of an XPath expression graphically.
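The notes on axes and text values can be sketched like this, again assuming the third-party lxml package. The `mixed` element is a hypothetical extra, not part of the original example; it is there to show a case where `text()` and `string()` stop agreeing:

```python
# Assumption: lxml is available (pip install lxml).
from lxml import etree

root = etree.fromstring(
    "<produce>\n"
    " <item>apple</item>\n"
    " <item>banana</item>\n"
    " <item>pepper</item>\n"
    "</produce>"
)

# Same text, different result types: a list of text nodes vs. one string.
as_nodes = root.xpath("/produce/item[1]/text()")
as_string = root.xpath("string(/produce/item[1])")

# Switching from the default child axis to the ancestor axis.
ancestors = root.xpath("//item/ancestor::produce")

# Hypothetical mixed content: the element has two text node children,
# but its string value also includes the text inside <b>.
mixed = etree.fromstring("<item>app<b>l</b>e</item>")
mixed_nodes = mixed.xpath("text()")
mixed_string = mixed.xpath("string(.)")

print(as_nodes, repr(as_string))
print([el.tag for el in ancestors])
print(mixed_nodes, repr(mixed_string))
```

All three items share the same `produce` ancestor, so the ancestor query yields that element once, since a node-set contains no duplicates.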