Ruby forgets local variables during a while loop?
From the Ruby Programming Language:
alt text http://bks0.books.google.com/books?id=jcUbTcr5XWwC&printsec=frontcover&img=1&zoom=5&sig=ACfU3U1rnYKha_p7vEkpPm1Ow3o9RAM0nQ
Blocks and Variable Scope
Blocks define a new variable scope: variables created within a block exist only within that block and are undefined outside of the block. Be cautious, however; the local variables in a method are available to any blocks within that method. So if a block assigns a value to a variable that is already defined outside of the block, this does not create a new block-local variable but instead assigns a new value to the already-existing variable. Sometimes, this is exactly the behavior we want:
total = 0
data.each {|x| total += x } # Sum the elements of the data array
puts total # Print out that sum
Sometimes, however, we do not want to alter variables in the enclosing scope, but we do so inadvertently. This problem is a particular concern for block parameters in Ruby 1.8. In Ruby 1.8, if a block parameter shares the name of an existing variable, then invocations of the block simply assign a value to that existing variable rather than creating a new block-local variable. The following code, for example, is problematic because it uses the same identifier i as the block parameter for two nested blocks:
1.upto(10) do |i| # 10 rows
1.upto(10) do |i| # Each has 10 columns
print "#{i} " # Print column number
end
print " ==> Row #{i}\n" # Try to print row number, but get column number
end
Ruby 1.9 is different: block parameters are always local to their block, and invocations of the block never assign values to existing variables. If Ruby 1.9 is invoked with the -w flag, it will warn you if a block parameter has the same name as an existing variable. This helps you avoid writing code that runs differently in 1.8 and 1.9.
Ruby 1.9 is different in another important way, too. Block syntax has been extended to allow you to declare block-local variables that are guaranteed to be local, even if a variable by the same name already exists in the enclosing scope. To do this, follow the list of block parameters with a semicolon and a comma-separated list of block local variables. Here is an example:
x = y = 0 # local variables
1.upto(4) do |x;y| # x and y are local to block
# x and y "shadow" the outer variables
y = x + 1 # Use y as a scratch variable
puts y*y # Prints 4, 9, 16, 25
end
[x,y] # => [0,0]: block does not alter these
In this code, x is a block parameter: it gets a value when the block is invoked with yield. y is a block-local variable. It does not receive any value from a yield invocation, but it has the value nil until the block actually assigns some other value to it. The point of declaring these block-local variables is to guarantee that you will not inadvertently clobber the value of some existing variable. (This might happen if a block is cut-and-pasted from one method to another, for example.) If you invoke Ruby 1.9 with the -w option, it will warn you if a block-local variable shadows an existing variable.
Blocks can have more than one parameter and more than one local variable, of course. Here is a block with two parameters and three local variables:
hash.each {|key,value; i,j,k| ... }
I think this is because message is defined inside the loop. At the end of the loop iteration "message" goes out of scope. Defining "message" outside of the loop stops the variable from going out of scope at the end of each loop iteration. So I think you have the right answer.
You could output the value of message at the beginning of each loop iteration to test whether my suggestion is correct.
Contrary to some of the other answers, while
loops don't actually create a new scope. The problem you're seeing is more subtle.
Prerequisite knowledge: a brief scoping demo
To help show the contrast, blocks passed to a method call DO create a new scope, such that a newly assigned local variable inside the block disappears after the block exits:
### block example - provided for contrast only ###
[0].each {|e| blockvar = e }
p blockvar # NameError: undefined local variable or method
But while
loops (like your case) are different, because a variable defined in the loop will persist:
arr = [0]
while arr.any?
whilevar = arr.shift
end
p whilevar # prints 0
Summary of the "problem"
The reason you get the error in your case is because the line that uses message
:
puts "#{message}"
appears before any code that assigns message
.
It's the same reason this code raises an error if a
wasn't defined beforehand:
# Note the single (not double) equal sign.
# At first glance it looks like this should print '1',
# because the 'a' is assigned before (time-wise) the puts.
puts a if a = 1
Not Scoping, but Parsing-visibility
The so-called "problem" - i.e. local-variable visibility within a single scope - is due to ruby's parser. Since we're only considering a single scope, scoping rules have nothing to do with the problem. At the parsing stage, the parser decides at which source locations a local variable is visible, and this visibility does not change during execution.
When determining if a local variable is defined (i.e. defined?
returns true) at any point in the code, the parser checks the current scope to see if any code has assigned it before, even if that code has never run (the parser can't know anything about what has or hasn't run at the parsing stage). "Before" meaning: on a line above, or on the same line and to the left-hand side.
An exercise to determine if a local is defined (i.e. visible)
Note that the following only applies to local variables, not methods. (Determining whether a method is defined in a scope is more complex, because it involves searching included modules and ancestor classes.)
A concrete way to see the local variable behavior is to open your file in a text editor. Suppose also that by repeatedly pressing the left-arrow key, you can move your cursor backward through the entire file. Now suppose you're wondering whether a certain usage of message
will raise the NameError
. To do this, position your cursor at the place you're using message
, then keep pressing left-arrow until you either:
- reach the beginning of the current scope (you must understand ruby's scoping rules in order to know when this happens)
- reach code that assigns
message
If you've reached an assignment before reaching the scope boundary, that means your usage of message
won't raise NameError
. If you don't reach any assignment, the usage will raise NameError
.
Other considerations
In the case a variable assignment appears in the code but isn't run, the variable is initialized to nil
:
# a is not defined before this
if false
# never executed, but makes the binding defined/visible to the else case
a = 1
else
p a # prints nil
end
While loop test case
Here's a small test case to demonstrate the oddness of the above behavior when it happens in a while loop. The affected variable here is dest_arr
.
arr = [0,1]
while n = arr.shift
p( n: n, dest_arr_defined: (defined? dest_arr) )
if n == 0
dest_arr = [n]
else
dest_arr << n
p( dest_arr: dest_arr )
end
end
which outputs:
{:n=>0, :dest_arr_defined=>nil}
{:n=>1, :dest_arr_defined=>nil}
{:dest_arr=>[0, 1]}
The salient points:
- The first iteration is intuitive,
dest_arr
is initialized to[0]
. - But we need to pay close attention in the second iteration (when
n
is1
):- At the beginning,
dest_arr
is undefined! - But when the code reaches the
else
case,dest_arr
is visible again, because the interpreter sees that it was assigned 2 lines above. - Notice also, that
dest_arr
is only hidden at the start of the loop; its value is never lost (as evidenced by the final contents being[0, 1]
).
- At the beginning,
This also explains why assigning your local before the while
loop fixes the problem. The assignment doesn't need to be executed; it only needs to appear in the source code.
Lambda example
f1 = ->{ f2 }
f2 = ->{ f1 }
p f2.call()
# The following fails because the body of f1 tries to access f2 before an assignment for f2 was seen by the parser.
p f1.call() # undefined local variable or method `f2'.
Fix this by putting an f2
assignment before f1
's body. Remember, the assignment doesn't actually need to be executed!
f2 = nil # Could be replaced by: if false; f2 = nil; end
f1 = ->{ f2 }
f2 = ->{ f1 }
p f2.call()
p f1.call() # ok
Method-masking gotcha
Things get really hairy if you have a local variable with the same name as a method:
def dest_arr
:whoops
end
arr = [0,1]
while n = arr.shift
p( n: n, dest_arr: dest_arr )
if n == 0
dest_arr = [n]
else
dest_arr << n
p( dest_arr: dest_arr )
end
end
Outputs:
{:n=>0, :dest_arr=>:whoops}
{:n=>1, :dest_arr=>:whoops}
{:dest_arr=>[0, 1]}
A local variable assignment in a scope will "mask"/"shadow" a method call of the same name. (You can still call the method by using explicit parentheses or an explicit receiver.) So this is similar to the previous while
loop test, except that instead of becoming undefined above the assignment code, the dest_arr
method becomes "unmasked"/"unshadowed" so that the method is callable w/o parentheses. But any code after the assignment will see the local variable.
Some best-practices we can derive from all this
- Don't name local variables the same as method names in the same scope
- Don't put the initial assignment of a local variable inside the body of a
while
orfor
loop, or anything that causes execution to jump around within a scope (calling lambdas orContinuation#call
can do this too). Put the assignment before the loop.