Query: adding additional operator changes meaning of preceding operators

This is a corner case that occurs when ascending operators are interleaved with descending operators. It falls into an undocumented grey area.

In the Details and Options section of the Query documentation, we read the following special rule:

When one or more descending operators are composed with one or more ascending operators (e.g. desc /* asc), the descending part will be applied, then subsequent operators will be applied to deeper levels, and lastly the ascending part will be applied to the result.

The statement appears to apply to the case at hand. But appearances can be deceiving. Consider:

Dataset;    (* evaluate the symbol once so that the Dataset` context is loaded *)

Dataset`DescendingQ[Select[EvenQ]]    (* True  *)
Dataset`DescendingQ[Append[1]]        (* False *)
Dataset`DescendingQ[Sort]             (* True  *)

This means that the operator Select[EvenQ] /* Append[1] /* Sort has the form desc /* asc /* desc, with descending and ascending elements interleaved.

The documentation (weakly!) suggests that the special rule only applies when all of the descending operators precede all of the ascending operators. Since our ascending operator is sandwiched between two descending operators, the rule is not applicable.

We can make the special rule apply by replacing the descending operator Sort with the ascending operator Query[Sort]:

Dataset`DescendingQ[Query[Sort]]
(* False *)

Range[5] // Query[Select[EvenQ] /* Append[1] /* Query[Sort], f]
(* {1, f[2], f[4]} *)

This is the result we seek.
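To see why this ordering gives the desired result, the three stages of the special rule can be replayed by hand with ordinary list functions. This is only a sketch of the evaluation order, not of how Query is implemented internally; f is left undefined so that it stays symbolic, and the names stage1 to stage3 are purely illustrative.

```wolfram
stage1 = Select[Range[5], EvenQ]    (* descending part runs first: {2, 4} *)
stage2 = f /@ stage1                (* level-two operator is applied next: {f[2], f[4]} *)
stage3 = Sort[Append[stage2, 1]]    (* ascending part runs last: {1, f[2], f[4]} *)
```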

Commentary

Since the special rule is inapplicable, how are the interleaved operators interpreted? The documentation is silent, but apparently the resulting composition is treated as a single ascending operator. Thus, when a descending operator follows an ascending operator, all of the descending operators lose their special status, even the ones that precede the ascending operator.

This explains the result we see:

Range[5] // Query[Select[EvenQ] /* Append[1] /* Sort, f]

(* {1} *)

Since the level one operator is being treated as ascending in its entirety, it is applied after f. Therefore, Select[EvenQ] returns an empty list, since none of the elements {f[1], f[2], ...} satisfies EvenQ. Append[1] then acts upon the empty list, and the resulting single-element list is left unchanged by Sort.
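The same chain of events can be reproduced without Query by applying f first and then running the three level-one operators in order. This is a hand-written sketch of the apparent evaluation order, assuming a symbolic (undefined) f; the intermediate variable names are illustrative only.

```wolfram
applied  = f /@ Range[5]            (* f runs first: {f[1], f[2], f[3], f[4], f[5]} *)
selected = Select[applied, EvenQ]   (* EvenQ[f[2]] is False, so nothing survives: {} *)
appended = Append[selected, 1]      (* {1} *)
Sort[appended]                      (* {1} *)
```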

Notwithstanding the documentation being silent on the matter, I am tempted to call this behaviour a bug. At the very least, I feel the operator composition rules would be simpler if leading descending operators always maintained their special status. It stands to reason that any descending operators that follow an ascending operator lose that status.

But unless and until the rules are changed in this way, we must keep an eye out for this very subtle case.

Hackery

If we want to try out the revised rule to see how it feels, we can hack the V11.0.1 definition:

BeginPackage["Dataset`Query`PackagePrivate`", {"Dataset`"}]

comp[ops:RightComposition[Longest[desc__ ? VectorDescendingQ], asc__]] :=
  vectorDescendingOp[RightComposition[desc]] /* chainedOp[RightComposition[asc]];

comp[ops:RightComposition[desc___ ? VectorDescendingQ, descScalar_ ? ScalarDescendingQ, asc__]] :=
  scalarDescendingOp[desc /* descScalar] /* chainedOp[RightComposition[asc]];

EndPackage[]

so then:

Range[5] // Query[Select[EvenQ] /* Append[1] /* Sort, f]

(* {1, f[2], f[4]} *)

Naturally, this hack is unsanctioned and brittle.


I don't know why RightComposition doesn't work here, but using Composition for the last two operators gives the result we expected.

Range[5] // Query[Select[EvenQ] /* Sort@*Append[1], f]

(* {1, f[2], f[4]} *)

Tags: Dataset, Query