Why does removing a function and defining it on the same line not work?

Executing Trace on an expression reveals what is actually happening:

Trace[Remove@x; x = 1]
(*{Remove[Removed[x]];Removed[x]=1,{Remove[Removed[x]],Null},{Removed[x]=1,1},1}*)

"the Wolfram Language always reads in a complete input expression, and interprets the names in it, before it executes any part of the expression." (see: https://reference.wolfram.com/language/tutorial/SettingUpWolframLanguagePackages.html)

The whole line is one expression with the head CompoundExpression. The variable x gets replaced with Removed["x"] throughout that expression. Then it seems that Removed["x"] is assigned the value 1 (see Leonid Shifrin's answer to this question for more details about Removed symbols).

However, when you write it on two different lines, it is interpreted as two separate expressions and hence you don't face this problem.
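If the goal is only to discard earlier definitions of f (an assumption about the intent, not something the question states), a minimal sketch is to use ClearAll instead of Remove. ClearAll keeps the symbol itself, so the name resolves to the same symbol throughout the expression and the one-line form behaves as expected:

```mathematica
(* Assumption: we only want to wipe old definitions and attributes of f, not remove the symbol *)
ClearAll[f]; f[x_] := 1; f[1]
(* evaluates to 1 *)
```

This sidesteps the problem rather than explaining it: no symbol is ever removed, so nothing in the expression ends up referring to a removed symbol.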


I use the term "internal symbol" rather than just "symbol" to make it easier to distinguish between the name of a symbol and the internal representation of the symbol inside the kernel. I felt I should clarify this as I have not seen this terminology used elsewhere (and WRI may not agree this is a useful concept).

The "Theory"

First let's consider what happens when lines of code are evaluated.

The Front End reads code and sends strings corresponding to expressions to the kernel for evaluation. If you evaluate multiple lines of code, it keeps reading until

  • it encounters a newline, and
  • the expression it is reading is complete when it sees this newline

For each expression found in this way, the front end sends the kernel a string corresponding to that expression. (The string may instead encode the boxes that represent the expression, but that distinction is not relevant here.) The kernel then converts this string to an internal representation of the expression.

I think that symbols are not represented as strings in this internal representation, one reason being that this would probably not perform well. A more helpful model is to think of the internal representation of a symbol as an object that has its name (a string) and/or perhaps an identifier as attributes. Other things that can be looked up for any given symbol are whether it has been removed and what rules are associated with it. Note that in this (speculative) model, the kernel can still distinguish objects whose attributes have the same values (i.e. symbols that have the same name). It does not really matter where this information is stored, whether in the attributes of an object or in some global data structure. What matters is that in this model there is no one-to-one correspondence between strings and internal representations of symbols.

Normally there is at most one internal symbol that has any given name (e.g. "f"). In particular, there is at most one internal symbol with that name that has not been removed. When you evaluate Remove[f], you "remove" the unique internal symbol that has the name "f". When a new expression containing f is then sent to the kernel (as a string), the kernel creates a new internal symbol with the name "f", and this is now the unique internal symbol with the name "f" that has not been removed. The kernel uses this new internal symbol when it builds the internal representation of that expression and of every later expression containing f (until the symbol is removed again). The internal symbol that has been removed can still be used, but its definitions are cleared (I show an example later).
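The name lookup described above can be observed from the outside with NameQ. The following is a small sketch, not a view into the kernel's internals. The name demoSym is hypothetical and chosen only for this illustration, and strings are used throughout so that merely reading the lines does not itself create the symbol:

```mathematica
(* demoSym is a hypothetical name used only for this illustration *)
ToExpression["Global`demoSym"];           (* parsing the string creates the symbol if needed *)
existsBefore = NameQ["Global`demoSym"];
Remove["Global`demoSym"];                 (* string form of Remove: nothing is parsed as a symbol here *)
existsAfter = NameQ["Global`demoSym"];
ToExpression["Global`demoSym"];           (* parsing the name again creates a fresh symbol *)
existsAgain = NameQ["Global`demoSym"];
{existsBefore, existsAfter, existsAgain}
(* expected: {True, False, True} *)
```

The middle False is the state in which no unremoved symbol with that name exists; the final True corresponds to the kernel creating a new one the next time the name is read.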

Your examples

I have slightly modified the examples. My explanations are speculative in that they are based on the speculations above.

Snippet without a newline

In

Remove[f]; f[x_] := 1;
f[1]
 f[1]

the expression corresponding to this whole line, with head CompoundExpression, i.e. CompoundExpression[Remove[f], f[x_]:=1, Null], is sent to the kernel at once. This happens because the Front End keeps reading until it encounters a newline and the expression is complete when it sees this newline. The string corresponding to this expression is then sent to the kernel and the kernel creates an internal representation.

In this internal representation both instances of f correspond to the same internal symbol (let's call this iF1). When the kernel evaluates the expression, it encounters the internal representation of Remove[f], which we can think of as Remove[iF1], and the internal symbol iF1 is removed. Next, the kernel evaluates f[x_] := 1;, which we can think of as iF1[x_] := 1;, so a rule is associated with the internal symbol iF1. Because iF1 is removed, this definition will not be relevant under normal circumstances, but as we will see later the definition is indeed made and we can still access such definitions.

When we send a new expression f[1] to the kernel, the kernel sees that there is no symbol with name "f" that has not been removed, so it creates a new one, say iF2. It then creates and evaluates the internal representation, which we can think of as iF2[1]. No rule is associated with iF2, so this evaluates to iF2[1], which is displayed as f[1].

Snippet with a newline

However, in

Remove[f]; 
f[x_] := 1;
f[1]
1

the front end reads until it encounters the first newline. At this point, what it has read so far corresponds to a complete expression, i.e. Remove[f];. So the string corresponding to Remove[f]; is sent to the kernel. The kernel creates an internal representation, which we can think of as Remove[iF3];. The internal symbol iF3 is removed.

Next the front end continues reading where it left off. It reads until it finds a newline at which the expression it is reading is complete. As a result, the next line, f[x_] := 1;, is sent to the kernel. Because there is no symbol with name f that has not been removed (because iF3 was removed), the kernel creates a new one, iF4. The kernel evaluates the internal representation of the expression, which we can think of as iF4[x_] := 1;. So a rule is associated with iF4.

Now when we send a new expression f[1], the kernel looks up the unique symbol with name f that has not been removed, which is iF4, and it evaluates the internal representation, which we can think of as iF4[1], which evaluates to 1.
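The effect of the newline can also be imitated within a single input by postponing the parsing. The sketch below is an illustration of the reasoning above, not something taken from the question: ToExpression only reads the name f when it is evaluated, which is after Remove[f] has run, so the definition and the call both use the newly created symbol.

```mathematica
(* Only the first f is read when the input is parsed; the strings are parsed later, during evaluation *)
Remove[f]; ToExpression["f[x_] := 1"]; ToExpression["f[1]"]
(* evaluates to 1 *)
```

In terms of the model used here, the f in Remove[f] is the old internal symbol, while both strings resolve to the new one.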

Verification of the explanation using handlers

Perhaps I should have mentioned earlier that some of the claims I made above can be verified by using $NewSymbol. I just saw in this interesting answer that we can also track when symbols are removed, so let's verify how the examples work. I modified the examples to include symbols that help us to see which line is being evaluated. Once we are confident that that is how the Front End works, we could also split up the cells so that they each have only one line, (or similar), but I feel this warrants additional verification first.

Make sure you start with a new kernel. The first time output is generated using a fresh kernel, new symbols are generated, so in order to not let this clutter the output, I include a line containing only 1, before we start tracking symbols (also evaluate this "cell" when we analyse the second example).

1
With[{h = #}, 
    Internal`AddHandler[h, 
     Print@(h -> {##}) &]] & /@ {"RemoveSymbol", "NewSymbol"};

The first example then gives us

Remove[f]; f[x_] := 1
secondLine; f[1]
(*prints: 
NewSymbol->{{f,Global`}}
NewSymbol->{{x,Global`}}
RemoveSymbol->{{f,Global`}}
NewSymbol->{{secondLine,Global`}}
NewSymbol->{{f,Global`}}
*)
f[1]

It was desirable to start with a new kernel, because in this case there are no internal symbols with the names f and x yet. The prints show us that indeed both these internal symbols are generated first (even though x only first appears to the right of a semicolon). Only after that is the internal representation of the first line evaluated and the internal symbol with the name f removed. We see that a new symbol with the name f is generated, but only when the second line is being evaluated. So indeed the assignment is made to the old internal symbol f that has already been removed. This assignment has no effect on the new symbol with the name f, so f[1] is the displayed output.

With a fresh kernel that has handlers and that has already generated the symbols associated with generating the first output, the second example gives us

Remove[f];
secondLine; f[x_] := 1;
thirdLine; f[1]
(*prints:
NewSymbol->{{f,Global`}}
RemoveSymbol->{{f,Global`}}
NewSymbol->{{secondLine,Global`}}
NewSymbol->{{f,Global`}}
NewSymbol->{{x,Global`}}
NewSymbol->{{thirdLine,Global`}}
*)
1

In this case, we see that indeed an (internal) symbol with the name f is removed (in the evaluation of the first line) before a definition is associated with a symbol with the name f (in the evaluation of the second line). The internal symbol used in the evaluation of the third line is the same one that is used in the second line, as nothing has been removed since we started evaluating the second line. So the definition made in the second line affects the evaluation of f[1] in the third line and the output is 1.

Addendums

Addendum 1: Removed symbols may still exist

The behaviour in the following two examples corresponds to the "theory" above. The first example shows that we can still refer to removed symbols. The example also shows that there may be multiple internal symbols with the same name that are all removed. Talking about Removed["a"] is in that sense insufficient (and the example shows why an assignment like Removed["x"]=2 could not work).

ruleB = a -> 1; b = Hold[a];
Remove[a];
ruleC = a -> 2; c = Hold[a];
Remove[a];
{b, c, Hold[a]}
{b, c, Hold[a]} /. {ruleB, ruleC}
{Hold[Removed[a]],Hold[Removed[a]],Hold[a]} 
{Hold[1], Hold[2], Hold[a]}

The second example shows that assignments can be made to removed symbols and the definitions are persistent.

Remove[d]; e := d; d = 1;
{d, e}
{d,1}

Addendum 2: Remark

Of course the kernel does not check every symbol to see whether or not it is a symbol with name "f" that has not been removed. I hesitated to introduce a new name for "the unique internal symbol with name X that has not been removed" (which need not exist!), but I kind of like CIS (canonical internal symbol), used as in: iF1 is the CIS of "f". The kernel must keep track of a simple lookup table with entries of the form string X -> CIS of X.
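As a toy model of such a lookup table (purely illustrative, and not a claim about how the kernel is actually implemented), one can keep an association from names to inert objects that stand in for internal symbols. The names cis, counter, intern, remove and internal are all hypothetical helpers invented for this sketch:

```mathematica
(* Toy model only: cis, counter, intern, remove and internal are hypothetical names *)
cis = <||>; counter = 0;

(* Return the CIS of a name, creating a new internal symbol if none exists *)
intern[name_String] :=
  If[KeyExistsQ[cis, name], cis[name], cis[name] = internal[name, ++counter]];

(* "Removing" drops only the table entry; existing references to the object survive *)
remove[name_String] := (cis = KeyDrop[cis, name];);

a = intern["f"]; b = intern["f"];   (* same internal symbol both times *)
remove["f"];
c = intern["f"];                    (* a new internal symbol with the same name *)
{a === b, a === c, c}
(* {True, False, internal["f", 2]} *)
```

Here a plays the role of iF1 and c the role of iF2: they share the name "f" but are distinct objects, and a remains usable after its table entry is gone.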

On the use of the word "removed": One could argue that we should only say that an internal symbol is removed if there are no more references to the internal symbol and memory can be cleared. One could argue that we should not say that for any internal symbol we can look up whether it has been removed or not, but rather that it has been scheduled for memory clearing or something. I decided to stick to "if Remove has been called on an internal symbol, then it is removed".


First of all, consider the following two successive inputs:

Remove[f]; f[x_] := 1; {f, Head[f], SymbolName[f], AtomQ[f], DownValues[f]}

{f, Head[f], SymbolName[f], AtomQ[f], DownValues[f]}
{Removed["f"], Symbol, "f", True, {HoldPattern[Removed["f"][x_]] :> 1}}

{f, Symbol, "f", True, {}}

From the outputs it becomes clear how things work:

1) All the symbols used in the input line are created at the parsing stage regardless of the contents of the input expression, i.e. Remove doesn't affect this.

2) Upon evaluation, Remove[f] marks the symbol f for removal but doesn't remove it. In effect it is more like renaming the symbol throughout the input expression than removing it (but as one can see from SymbolName, the name isn't actually changed; we just won't be able to refer to the symbol by its name in subsequent inputs). The symbol will actually be removed at the appropriate stage of the evaluation process (in the above example, after the evaluation of the first input has finished). The symbol can't be removed while it is still used by the input expression or by any internal definition, because otherwise they would be invalidated.

3) The print form of the symbol f marked for removal is Removed["f"], but internally it is still a symbol to which an assignment can be made. We can also see its DownValues etc.

An additional illustration for the last two points:

f := x; Remove[x];
{f, Head[f], OwnValues[f], SymbolName[f]}
Evaluate[f] = 1;                             (* <-- assigning a value to Removed["x"] *)
{f, Head[f], OwnValues[f]}
{Removed["x"], Symbol, {HoldPattern[f] :> Removed["x"]}, "x"}

{1, Integer, {HoldPattern[f] :> Removed["x"]}}