XQuery/XPath: Using count() and max() to return the element with the highest count
This may help:
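declare default element namespace 'books';
(: $doc is assumed to be bound to the books document :)
(for $name in distinct-values($doc/books/*/*/name)
 (: children of <books> that contain a name with this value :)
 let $entries := $doc/books/*[data(*/name) = $name]
 order by count($entries) descending
 (: return only the matching name, not every name inside those entries :)
 return $entries/*/name[data(.) = $name])[1]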
Here is a pure XPath 2.0 expression, admittedly not for the faint-hearted:
(for $m in max(for $n in distinct-values(/*/b:book/(b:author | b:editor)
                                           /b:name/concat(b:fname, '|', b:lname)),
                   $cnt in count(/*/b:book/(b:author | b:editor)
                                   /b:name[$n eq concat(b:fname, '|', b:lname)])
               return $cnt
              ),
     $name in /*/b:book/(b:author | b:editor)/b:name,
     $fullName in $name/concat(b:fname, '|', b:lname),
     $count in count(/*/b:book/(b:author | b:editor)
                       /b:name[$fullName eq concat(b:fname, '|', b:lname)])
 return
   if ($count eq $m)
   then $name
   else ()
)[1]
where the prefix "b:" is associated with the namespace "books".
XSLT 2.0-based verification:
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:b="books">
  <xsl:output omit-xml-declaration="yes" indent="yes"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:sequence select=
    "(for $m in max(for $n in distinct-values(/*/b:book/(b:author | b:editor)
                                                /b:name/concat(b:fname, '|', b:lname)),
                        $cnt in count(/*/b:book/(b:author | b:editor)
                                        /b:name[$n eq concat(b:fname, '|', b:lname)])
                    return $cnt
                   ),
          $name in /*/b:book/(b:author | b:editor)/b:name,
          $fullName in $name/concat(b:fname, '|', b:lname),
          $count in count(/*/b:book/(b:author | b:editor)
                            /b:name[$fullName eq concat(b:fname, '|', b:lname)])
      return
        if ($count eq $m)
        then $name
        else ()
     )[1]
    "/>
  </xsl:template>
</xsl:stylesheet>
When this transformation is applied to the provided XML document:
<books xmlns="books">
  <book ISBN="i0321165810" publishername="OReilly">
    <title>XPath</title>
    <author>
      <name>
        <fname>Priscilla</fname>
        <lname>Walmsley</lname>
      </name>
    </author>
    <year>2007</year>
    <field>Databases</field>
  </book>
  <book ISBN="i0321165812" publishername="OReilly">
    <title>XQuery</title>
    <author>
      <name>
        <fname>Priscilla</fname>
        <lname>Walmsley</lname>
      </name>
    </author>
    <editor>
      <name>
        <fname>Lisa</fname>
        <lname>Williams</lname>
      </name>
    </editor>
    <year>2003</year>
    <field>Databases</field>
  </book>
  <publisher publishername="OReilly">
    <web-site>www.oreilly.com</web-site>
    <address>
      <street_address>hill park</street_address>
      <zip>90210</zip>
      <state>california</state>
    </address>
    <phone>400400400</phone>
    <e-mail>[email protected]</e-mail>
    <contact>
      <field>Databases</field>
      <name>
        <fname>Anna</fname>
        <lname>Smith</lname>
      </name>
    </contact>
  </publisher>
</books>
the wanted, correct name element is selected and output:
<name xmlns="books">
  <fname>Priscilla</fname>
  <lname>Walmsley</lname>
</name>
I've always felt this was an omission in XPath: the max() and min() functions return the highest/lowest value, whereas what you usually want is the object(s) in a collection that have the highest/lowest value for some expression. One solution is to sort the objects on that value and take the first/last from the list, which seems inelegant. Computing the min/max and then selecting the items whose value matches this seems equally unappealing. In Saxon there has long been a pair of higher-order extension functions saxon:highest() and saxon:lowest() which take a sequence and a function, and return the item(s) from the sequence having the lowest or highest values of the function result. The good news is that in XPath 3.0 you can write these functions yourself (in fact, they are given as example user-written functions in the spec).
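To make that last point concrete, here is a minimal sketch of such a user-written function in XQuery 3.0. It is neither the example from the spec nor the Saxon extension: the name local:highest, its argument order, and the max-then-filter approach are assumptions chosen for this illustration. It also assumes the context item is the sample books document shown earlier in this thread and that the prefix b is bound to the namespace "books".

```xquery
xquery version "3.0";
declare namespace b = "books";

(: Hypothetical helper for this sketch only (not saxon:highest, not the
   spec's example): returns every item of $seq whose key value equals
   the maximum key value found in $seq. :)
declare function local:highest(
  $seq as item()*,
  $key as function(item()) as xs:anyAtomicType
) as item()* {
  let $max := max(for $item in $seq return $key($item))
  return $seq[$key(.) eq $max]
};

(: Assumption: the context item is the sample books document. :)
let $names := /*/b:book/(b:author | b:editor)/b:name
return
  local:highest(
    $names,
    (: key = number of name elements with the same first and last name :)
    function($n as item()) as xs:anyAtomicType {
      count($names[b:fname eq $n/b:fname and b:lname eq $n/b:lname])
    }
  )[1]
```

Because the helper returns all items that share the top key value, the trailing [1] is only there to pick a single name element, as the earlier answers do. Drop it to see every tie. Note that this sketch calls the key function twice per item, so a fold-based version may be preferable for expensive keys.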