Evaluation of Alternatives

The two examples in this question relate to two different aspects of pattern matching. I will start with the simpler to understand and intentional aspect, which is the second example.

g[2] /. g[ 1 + (1|other) ] -> post
(* g[2] *)

In the above, the pattern doesn't match, and it can never match. g[2] has one argument. Since Plus is OneIdentity, 2 catch match Plus[2], Plus[Plus[2]], and so on, but it can never match a 2-argument Plus, which is what the pattern contains. The "expanded" pattern g[1 + 1] | g[1 + other] -> post seems to work, only because it evaluates:

In[20]:= g[1 + 1] | g[1 + other] -> post
Out[20]= g[2] | g[1 + other] -> post

It wouldn't matter if we used RuleDelay, because that only prevents the RHS from evaluating. Moreover, if we used HoldPattern to prevent the LHS from evaluating, it doesn't match:

In[22]:= g[2] /. HoldPattern[g[1 + 1] | g[1 + other]] -> post
Out[22]= g[2]

This may not be explicitly documented, but it is absolutely the design: the pattern matcher accounts for attributes, sequences, and so forth, but it does not simulate evaluation. Beyond the fact that evaluation may have undesired side effects, we would have a true circularity problem given that evaluation is done via pattern-matching! Pattern matching compares the structure of one expression against another structure, to see if rules (replacement, evaluation, etc.) should fire.

Your first example is an occurence of a long-standing implementation issue. It's even more suprising given the following:

In[23]:= SetAttributes[f, Flat];
f[a, b, c] /. f[a, f[b, c] | o_ /; o === other] -> post
Out[24]= post

Weird, right? There are a few issues here, but the main one is that the pattern matcher treats literal patterns like f[b,c] and general patterns like o_ slightly differently (mainly for reasons of efficiency), but sometimes these two sets of rules conflict with each other and we end up with inconsistencies like the above. We have some improvements in the pipeline and ideas on how to completely clean this up, but unfortunately completely squashing all these issues is a ways off.


I think an acceptable solution is to Thread over Alternatives:

Basic solution:

SetAttributes[f, Flat];
f[a, b, c] /. Thread[f[a, f[b, c] | other], Alternatives] -> post
post

Though, it won't be very helpful in more complex situations: f[a | b, f[b, c | h]].

General solution (experimental)

tupplesOver[
 f[a | g, f[b, c | h] | other],
 Alternatives
]
f[a, b, c] | f[g, b, c] | f[a, b, h] | f[g, b, h] | f[a, other] |  f[g, other]

Maybe it will be better shown when we drop Flat attribute from f:

ClearAll[f];
tupplesOver[
 f[a | g, f[b, c | h] | other],
 Alternatives]
f[a, f[b, c]] | f[g, f[b, c]] | f[a, f[b, h]] | f[g, f[b, h]] | 
 f[a, other] | f[g, other]

Code

tupplesOver[expr_, head_] := Module[{temp, myAlternatives},
  temp = myAlternatives[expr /. head -> myAlternatives];
  SetAttributes[myAlternatives, Flat];

  head @@ NestWhile[
    Function[arg,
     Module[{pos},
      pos = First@Reverse@Sort@Position[arg, _myAlternatives];
      MapAt[
       Thread[#, myAlternatives, pos[[{-1}]]] &,
       arg,
       Most@pos
       ]
      ]
     ],
    temp,
    (Count[#, myAlternatives, \[Infinity], Heads -> True] > 1) &
    ]
  ]

IMO it may be desired although I wouldn't say expected. Maybe someone more experienced will tell us why things are the way they are :)


After discussing with Kuba, I conjecture the following:

Patterns involving alternatives are not evaluated further when attempting to match each alternative

That is, somePattern[..., a|b, ...] originally evaluates as if a|b is a black box. Then, during pattern matching, the pattern does not evaluate any further when a|b is replaced by a and b in turn.

To the best of my knowledge, this behavior is not documented anywhere.

Personally, I find this behavior somewhat surprising -- in particular, I would intuitively expect the following patterns to be completely equivalent:

somePattern[..., a|b, ...]
somePattern[..., a, ...] | somePattern[..., b, ...]

As far as I can tell, nothing in the documentation suggests the two patterns above should not be equivalent. I'd be very interested to hear whether anybody has conflicting intuition, or any idea why Alternatives would be built this way.

I suppose there might be guiding principles behind deciding whether expressions are to be evaluated further, that could motivate this implementation. The only other case I know of where an expression is (meaningfully) not evaluated further is in the case of a global variable changing value which would influence the evaluation via a Condition. (See my previous question, How is LHS = RHS; … ; LHS (nontrivially) different from … ; RHS. ) It seems to me that this sort of behavior is not documented well anywhere.


Forcing equivalence of alternatives

As stated above, I intuitively expect the following patterns to be equivalent:

somePattern[..., a|b, ...]
somePattern[..., a, ...] | somePattern[..., b, ...]

That is, I expect one pattern to match a given expression if and only if the other pattern also matches that expression.

Inspired by Kuba's attempts, I create a function which can be applied to patterns to force the behavior I expect -- essentially that patterns thread over Alternatives :

threadAlt[patt_] := Flatten[ 
    patt //. (g : Except[Alternatives])[a___, x_Alternatives, b___] 
                     :> (g[a, #, b] &) /@ x ,
    Infinity, Alternatives]

I believe it functions identically to Kuba's topplesOver if generalized to

threadAlt[patt_, head_] := Flatten[ 
    patt //. (g : Except[head])[a___, head, b___] 
                     :> (g[a, #, b] &) /@ x ,
    Infinity, head]

In particular, this 'solves' the examples in the original question:

SetAttributes[f, Flat];

threadAlt@f[a, f[b,c]|other]
(* f[a, b, c] | f[a, other] *)

f[a,b,c] /. f[a, f[b,c]|other] -> post
(* f[a,b,c] *)

f[a,b,c] /. threadAlt@f[a, f[b,c]|other] -> post
(* post *)

Edit: Loose ends

As mentioned above, I don't see anything in the documentation that suggests Alternatives should function differently from my intuition (i.e., how my threadAlt function forces it to behave). I'm very curious about why Alternatives might be implemented the way it is, rather than the way I suggest.

In particular, I'm curious why Mathematica would not further evaluate a pattern when attempting to match each alternative. See discussion above.