Enforcing correct variable bindings and avoiding renamings for conflicting variables in nested scoping constructs

Scoping constructs, lexical scoping and variable renamings

It pays to understand in a bit more depth how the scoping constructs work and what happens behind the scenes when you execute one. In addition to the documentation, this was discussed in part here, but a short summary follows.

When the lexical scoping construct Sc[vars, body] executes (where Sc can stand for constructs such as With, Module, or Function), evaluation roughly proceeds through the following steps (or at least this represents our current understanding of it):

1. The list of local variables vars is analyzed.
2. The body is analyzed and tested for the presence of inner scoping constructs. This test is performed verbatim, so that, to be detected, an inner scoping construct ScInner has to be present in body in the form ScInner[innerVars, innerBody]. If the inner scoping construct is generated dynamically at run-time (via ScInner @@ ... or otherwise), Sc does not detect it at this stage.
3. If inner scoping constructs are found whose variables conflict with vars, then Sc renames those variables. It is important to stress that it is Sc that performs these renamings in the inner scoping constructs. Indeed, those are inert at this stage (since Sc has the HoldAll attribute, so body is kept unevaluated), so Sc is the only function in a position to do the renamings.
4. The actual variable binding happens. The body is searched for instances of vars, and those instances are lexically bound to the variables.
5. Depending on the nature of the scoping construct, further actions may be performed. Function does nothing, With replaces the symbols (variables) with their values in body (according to the bindings), while Module creates var$xxx variables (according to the bindings, both in the initialization and in body) and then performs the variable initializations.
6. The code in body is finally allowed to evaluate.
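As a minimal illustration of the detection and renaming steps, here is a sketch that assumes x and y have no global values. The outputs in the comments are what I would expect from the description above, not something taken from the documentation:

```wolfram
ClearAll[x, y];

(* The inner Function is present verbatim, so With detects it and
   renames its variable before injecting the value of y *)
With[{y = 1}, Function[x, x + y]]
(* Function[x$, x$ + 1] *)

(* The inner Function is only assembled at run time; With sees just
   Hold[...], which is not a scoping construct, and renames nothing *)
With[{y = 1}, Function @@ Hold[x, x + y]]
(* Function[x, x + 1] *)
```

Both results behave the same when applied to an argument; the difference is only in whether the renaming took place.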

How to fool scoping constructs

From the above description, it is clear that, if one wants to avoid renamings for some inner lexical scoping construct ScInner[innerVars, innerBody] for whatever reason, one has to generate this code dynamically, so that it is not present inside Sc verbatim. Again, depending on the situation, one may or may not want innerVars and innerBody to evaluate.

More often than not one wants to prevent such evaluation, so it is typical to use something like

With[{y = boundToZ}, With @@ Hold[{y = z}, boundToZ]]

With[{y = boundToZ}, Hold[{y = z}, boundToZ] /. Hold -> With]

or anything else that would prevent the innerVars or innerBody from unwanted early evaluation.
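To see what the run-time construction buys us, one can compare it with the verbatim nesting. This is a sketch assuming y, z and boundToZ have no global values; the results in the comments follow from the steps described earlier:

```wolfram
ClearAll[y, z, boundToZ];

(* Verbatim nesting: the outer With renames the inner y,
   so the symbol boundToZ is never bound to z *)
With[{y = boundToZ}, With[{y = z}, boundToZ]]
(* boundToZ *)

(* Run-time construction: the outer With substitutes y -> boundToZ
   inside Hold, and only then is the inner With assembled *)
With[{y = boundToZ}, With @@ Hold[{y = z}, boundToZ]]
(* z *)
```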

Here is a more meaningful example. The following is a new scoping construct, which executes a piece of code, code, with a variable var bound to the current value extracted from a Java iterator object:

ClearAll[javaIterate];
SetAttributes[javaIterate, HoldAll];
javaIterate[var_Symbol, code_][iteratorInstance_?JavaObjectQ] :=
    While[iteratorInstance@hasNext[], 
        With @@ Hold[{var = iteratorInstance@next[]}, code]
    ];

The iteratorInstance is expected to be a Mathematica reference to a Java object implementing the Iterator interface. The variable var is bound to the current value of the iterator (extracted via iteratorInstance@next[]), using With. This is non-trivial, since we construct this With from pieces, and therefore generate the lexical binding of var to the occurrences of var in code anew at every iteration of the While loop. In this case, the outer protecting scoping construct is actually SetDelayed. We need the construct With @@ Hold[...] to prevent the renaming of var, which is exactly what we don't want here.

However, there are cases where we do want some or all of innerVars or innerBody to evaluate before the binding stage for the inner scoping constructs. The case at hand falls into this category. In such a case, perhaps the most straightforward way is to use Sc @@ {innerVars, innerBody}, which is what acl did in his answer.

The case at hand

It is now clear why this solution works:

Module[{x, expr},
 expr = 2 x;
 Function @@ {x, expr}
]

Function[x$5494, 2 x$5494]

Since there wasn't a line Function[x,...] present verbatim, Module did not detect it. And since we do want the variable and body to evaluate before Function performs the variable bindings, the second version (Function @@ {...}) has been used.

You will note that Evaluate is not needed because List does not have the HoldAll attribute. This specific syntax is not the only approach. For example, h[x, expr] /. h -> Function or ReplacePart[{x, expr}, 0 -> Function] would also work, because there is no explicit Function[x, ...] in the code.
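A quick way to check these alternatives is to apply the resulting function right away. This is only a sketch; h is assumed to be a symbol without any definitions, used purely as a placeholder head:

```wolfram
ClearAll[h];  (* assumption: h is an inert placeholder head *)

(* Replace the placeholder head h by Function after x and expr have evaluated *)
Module[{x, expr},
 expr = 2 x;
 (h[x, expr] /. h -> Function)[3]
]
(* 6 *)

(* Same idea, swapping the List head (part 0) for Function *)
Module[{x, expr},
 expr = 2 x;
 ReplacePart[{x, expr}, 0 -> Function][3]
]
(* 6 *)
```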

It is instructive to realize that this version works too:

Module[{x, expr}, 
   expr = 2 x;
   Function[Evaluate[x], Evaluate[expr]]
]

While Function[...] is present here, the extra Evaluate around x in Function makes it impossible for Module to detect the conflict. Of course, there are many other ways to achieve the same effect.

It is now also clear why the following will not work:

Module[{x, expr},
 expr = 2 x;
 Function[x, z] /. z -> expr
]

Function[x, 2 x$151]

The point is, the substitution z -> expr happens only at the stage when the body of the Module is evaluated, while the binding stage happens earlier (as described above). During the binding stage, Module detects the name conflict all right, and of course renames. Only then is x converted into a newly created x$151, and only after all that the code inside Module executes - by which time it is too late since the symbols inside Function and expr are different.
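The difference becomes tangible once the two results are applied to an argument. In this sketch the names good and bad are mine, z is assumed to be a free global symbol, and the number after x$ will differ from session to session:

```wolfram
ClearAll[good, bad, z];  (* assumption: z has no value *)

good = Module[{x, expr}, expr = 2 x; Function @@ {x, expr}];
bad = Module[{x, expr}, expr = 2 x; Function[x, z] /. z -> expr];

good[3]
(* 6 *)

bad[3]
(* 2 x$nnn: the parameter and the symbol in the body are different,
   so the argument is simply ignored *)
```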

The case of `Block`

Block is a natural approach when guarding against global values, but as Szabolcs comments, it must be used with care. Block is not seen as a scoping construct for the purposes of the automatic renaming described in the tutorial. You can also see some additional relevant discussion here. Because of that, you will not get the "protection" that you may be accustomed to. Using Szabolcs's example:

f[z_] := Block[{x, expr}, expr = 2 x + z; Function[x, Evaluate[expr]]]

f[x]

Function[x, 3 x]

Note that this function will triple its argument rather than doubling it and adding Global`x, which may or may not be what you were expecting. Such injection is often very useful, but if you are accustomed to the automatic renaming behavior (even if you are unaware of the mechanism), this may come as a surprise.
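To make the capture explicit, one can compare a call where the argument has the same name as the Block variable with one where it does not. This is a sketch that assumes x and y have no global values:

```wolfram
ClearAll[f, x, y];
f[z_] := Block[{x, expr}, expr = 2 x + z; Function[x, Evaluate[expr]]]

(* The passed-in x is captured and becomes the Function parameter *)
f[x][5]
(* 15 *)

(* A differently named symbol stays free in the body *)
f[y][5]
(* 10 + y *)
```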

I think this should be OK

Module[{x, expr},
 expr = 2 x;
 Function @@ {x, expr}
 ]

(because {x, expr} gets evaluated before Function replaces the List head)

You can also just replace Module with Block:

Block[{x, expr},
   expr = 2 x;
   Function[x, Evaluate[expr]]]


Tags: Scoping, Pure Function, Functions, Faq, Function Construction
