V10's Operator Forms - what are they good for?
For me the operator forms of Map and Apply will probably provide the most important benefits in terms of code readability. Often I need to apply a sequence of transformations to some data, and I am fond of infix notation for this purpose. For example I find
a ~Position~ 0 ~SortBy~ Last
more readable than the "conventional"
SortBy[Position[a, 0], Last]
because I do not have to scan backwards and forwards in the expression to match the SortBy with the Last.
This is only possible when using functions which take the data as their first argument. Because Map and Apply take the data as their second argument, they do not fit easily into the left-to-right infix syntax. If my final step is to map Max across the list I would need to use something like
a ~Position~ 0 ~SortBy~ Last ~(#2 /@ #1 &)~ Max
(if I was determined to stick with infix), or more likely
a ~Position~ 0 ~SortBy~ Last // Max /@ # &
In both cases I am having to use a pure function just to get the arguments of Map in the correct order. In practice I would probably abandon the left-to-right principle and put the last operation at the beginning of the expression:
Max /@ (a ~Position~ 0 ~SortBy~ Last )
The operator form means that I can chain the transformations in a very natural way:
a // Position[0] // SortBy[Last] // Map[Max]
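As a quick sanity check that these spellings really are interchangeable, here is a small sketch. The matrix a is made-up sample data (an assumption for illustration, not from the discussion above); any nested list containing zeros would do.

```mathematica
(* hypothetical sample data *)
a = {{1, 0}, {0, 2}, {0, 0}};

(* infix with Map moved to the front *)
infix = Max /@ (a ~Position~ 0 ~SortBy~ Last);

(* conventional nested form *)
nested = Map[Max, SortBy[Position[a, 0], Last]];

(* V10 operator forms chained left to right *)
operator = a // Position[0] // SortBy[Last] // Map[Max];

(* True: all three spellings produce the same list *)
infix === nested === operator
```

Only the last version reads strictly in the order the transformations are applied.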
I would have liked to have more experience with the operator forms before this question was asked as I am short on examples, and I'm sure my opinion will evolve over time. Nevertheless I think I have enough familiarity with similar syntax to provide some useful comments.
Taliesin Beynon provided some background for this functionality in Chat:
Operator forms have turned out to be a huge win for writing readable code. Unfortunately I can't remember whether it was Stephen or me who first suggested them, so I don't know who should get the credit :). Either way it was a major (and risky) decision, and I had to argue with a lot of people in the company who remained skeptical, so credit goes to Stephen for just pushing it through. But they were motivated by the needs of Dataset's query language, which is an interesting historical detail I think.
We see that m_goldberg is correct in seeing operator forms as being important to Dataset.
Taliesin also claims that operator forms are "a huge win" for readability. I agree with this and have been a proponent of SubValues definitions, which are basically what "operator forms" are. I also like Currying(1),(2) though I haven't embraced it to the same degree.
You comment that operator forms only save a few characters over anonymous functions and this is usually true, but these characters, and more importantly the semantics behind them, are nevertheless significant. Being able to treat functions with partially specified parameters as functions (Currying) frees us from the cruft or baggage of a lot of Slot and Function use. Surely these are easier to read and write:
fn[1] /@ list (* fn[1, #] & /@ list *)
SortBy[list, Extract @ 2] (* SortBy[list, Extract[#, 2] &] *)
Note that I did not choose to use the operator form of SortBy here.
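For a user-defined function, a curried form like fn[1] has to be supplied by hand with a SubValues definition. The sketch below assumes a hypothetical two-argument fn (its body is invented purely for illustration) and checks that the short forms match their Slot-based equivalents.

```mathematica
ClearAll[fn];
fn[x_, y_] := x + 10 y      (* hypothetical two-argument function *)
fn[x_][y_] := fn[x, y]      (* SubValues definition: fn[1] now acts as a function *)

list = {1, 2, 3};

(* curried form versus explicit pure function *)
fn[1] /@ list === (fn[1, #] & /@ list)

(* built-in operator form of Extract versus explicit pure function *)
pairs = {{3, 1}, {1, 2}, {2, 0}};
SortBy[pairs, Extract @ 2] === SortBy[pairs, Extract[#, 2] &]
```

Note the difference in convention: my fn[1] fixes the first argument, whereas the built-in operator forms such as Extract[2] fix the later argument and wait for the data.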
Since Mathematica is a largely functional language these kinds of operations are frequent, which means that these effects quickly compound. Code that contains multiple Slot Functions can be quite hard to read as it is not always clear which # belongs to which &. As a hurriedly contrived example consider this snippet:
(SortBy[#, Mod[#, 5] &] &) /@ (Append[#, 11] &) /@ Partition[Range@9, 3]
If we first provide "operator forms" for functions that do not presently have them:
partition[n_][x_] := Partition[x, n]
mod[n_][m_] := Mod[m, n]
Then write the line above using such forms in all applicable places:
SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9
This is a considerable streamlining of syntax and much easier to read.
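Writing one SubValues definition per function gets repetitive, so a generic wrapper can do the same job. The helper name operatorForm below is my own invention, not a built-in, and it assumes the wrapped function takes its data as the first argument.

```mathematica
ClearAll[operatorForm];
(* hypothetical helper: operatorForm[f][args][x] evaluates f[x, args] *)
operatorForm[f_][args___][x_] := f[x, args]

(* the Slot-heavy original *)
slots = (SortBy[#, Mod[#, 5] &] &) /@ (Append[#, 11] &) /@ Partition[Range @ 9, 3];

(* the same pipeline using the generic helper for Mod and Partition *)
generic = SortBy[operatorForm[Mod][5]] /@ Append[11] /@ operatorForm[Partition][3] @ Range @ 9;

(* True: both give the same result *)
slots === generic
```

This is a little wordier than dedicated lowercase definitions, but it avoids defining a new symbol for every function that lacks an operator form.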
The example above is also semantically simpler:
Unevaluated[(SortBy[#1, Mod[#1, 5] &] &) /@ (Append[#1, 11] &) /@
Partition[Range[9], 3]] // LeafCount
Unevaluated[SortBy[mod @ 5] /@ Append[11] /@ partition[3] @ Range @ 9] // LeafCount
20 11
Theoretically that could pay dividends in performance though I am uncertain of the present reality of this. Some operations are slower, possibly due to an inability to compile, while others are faster. However I believe that this simplification opens the door for future optimizations.
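Anyone wanting to measure this on their own system could start from a rough sketch like the following. The choice of Sqrt and the data size are arbitrary assumptions on my part, and a single AbsoluteTiming run is noisy, so treat the numbers as indicative only.

```mathematica
(* arbitrary test data: 10^5 random reals *)
data = RandomReal[1, 10^5];

(* operator form of Map *)
{tOperator, r1} = AbsoluteTiming[Map[Sqrt][data]];

(* conventional Map with a Slot-based pure function *)
{tSlot, r2} = AbsoluteTiming[Map[Sqrt[#] &, data]];

(* confirm both forms compute the same values before comparing timings *)
r1 == r2

(* timings in seconds; these vary with machine and version *)
{tOperator, tSlot}
```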
I find the value of the new operator forms becomes critical when working with datasets. Consider
titanic = ExampleData[{"Dataset", "Titanic"}];
titanic[Count[#], "survived"] & /@ {True, False, _Missing}
{500, 809, 0}
Derive a data set for analyzing the survival of very young passengers.
cutoff = 8;
youngest = titanic[All, {"age", "survived"}][Select[#age <= cutoff &]];
pts =
Table[
Function[{x, y}, youngest[Select[#age == x && #survived == y &] /* Length]][x, y],
{x, Range @ cutoff}, {y, {True, False}}] // Transpose;
ListPlot[Tooltip /@ pts,
PlotStyle -> {Black, Red},
PlotMarkers -> {Automatic, 14},
PlotLegends -> {"Survived", "Perished"},
AxesLabel -> {"Age", "Count"}]
Without the new operator forms for functions like Count and Select, working with datasets would be much more awkward. It is only speculation on my part, but I believe datasets (i.e., structured data) provided the motivation for implementing the new forms.
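To see the same pattern without downloading the example data, here is a minimal sketch on a three-row dataset. The rows are invented for illustration and are not taken from the Titanic data.

```mathematica
(* hypothetical stand-in for the Titanic dataset *)
ds = Dataset[{
    <|"age" -> 3, "survived" -> True|>,
    <|"age" -> 40, "survived" -> False|>,
    <|"age" -> 5, "survived" -> False|>}];

(* operator forms slot straight into the query *)
youngCount = ds[Select[#age <= 8 &] /* Length];
survivedCount = ds[Count[True], "survived"];

(* the same counts computed on the underlying list of associations *)
rows = Normal[ds];
youngCountPlain = Length[Select[rows, #age <= 8 &]];
survivedCountPlain = Count[Lookup[rows, "survived"], True];

{youngCount == youngCountPlain, survivedCount == survivedCountPlain}
```

In the query versions the data never has to be named inside the operation, which is what lets Select and Count be passed around as standalone steps.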