TensorFlow Variables and Constants
In TensorFlow, the difference between constants and variables is that once you declare a constant, its value can't be changed in the future (and it must be initialized with a value, not with an operation).
When you declare a Variable, on the other hand, you can change its value later with the tf.assign() function (and it can be initialized with either a value or an operation).
The function tf.global_variables_initializer() initializes all variables in your code with the initial value each one was given, but it runs all the initializers together with no guaranteed order, so it doesn't work properly when dependencies exist between variables.
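To make the constant/variable distinction concrete, here is a minimal sketch. It assumes a TensorFlow version that ships tf.compat.v1 (1.13 or later, including 2.x) and builds everything inside an explicit tf.Graph so the session-based style used in this answer still runs. The function name assign_demo is just for illustration.

```python
import tensorflow as tf


def assign_demo():
    """Change a Variable with assign; a constant keeps its value."""
    # Assumption: tf.compat.v1 is available (TF >= 1.13 or TF 2.x).
    with tf.Graph().as_default():
        c = tf.constant(35, name='c')            # fixed value, cannot be reassigned
        v = tf.compat.v1.Variable(35, name='v')  # mutable state
        update = tf.compat.v1.assign(v, v + 5)   # op that writes v + 5 back into v
        init = tf.compat.v1.global_variables_initializer()
        with tf.compat.v1.Session() as session:
            session.run(init)                    # variables must be initialized first
            before = int(session.run(v))
            session.run(update)
            after = int(session.run(v))
            const_value = int(session.run(c))
    return before, after, const_value


if __name__ == '__main__':
    print(assign_demo())
```

There is no single variable dependency problem here because only one variable is involved, so global_variables_initializer() is enough.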
Your first code (#1) works properly because there are no dependencies between variable initializations and the constant is constructed with a value.
The second code (#2) doesn't work because tf.global_variables_initializer() does not guarantee that x is initialized before y. You can fix it using tf.variables_initializer() as follows:
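import tensorflow as tf  # TensorFlow 1.x API

x = tf.Variable(35, name='x')
model_x = tf.variables_initializer([x])  # initializer for x only
y = tf.Variable(x + 5, name='y')
model_y = tf.variables_initializer([y])  # initializer for y, which depends on x
with tf.Session() as session:
    session.run(model_x)  # initialize x first
    session.run(model_y)  # then y, which reads x
    print(session.run(y))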
The third code (#3) doesn't work properly because you are trying to initialize a constant with an operation, which isn't possible. To solve it, an appropriate strategy is the one used in (#1).
Regarding your last question: you need to run (a) session.run(model) whenever there are variables in your computation graph, so that they are initialized before you evaluate them with (b) print(session.run(y)).
I will point out the difference when using eager execution.
As of TensorFlow 2.0.b1, Variable and Constant trigger different behaviours when used with tf.GradientTape. Strangely, the official documentation does not say much about it.
Let's look at the example code in https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape:
import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    y = x * x
    z = y * y
dz_dx = g.gradient(z, x)  # 108.0 (4*x^3 at x = 3)
dy_dx = g.gradient(y, x)  # 6.0
del g  # Drop the reference to the tape
You had to watch x, which is a Constant. GradientTape does NOT automatically watch constants in the context, so every Constant you want a gradient for has to be watched explicitly. A single tape can watch several tensors (watch() accepts a tensor or a list of tensors), but you can also nest GradientTapes, one per Constant. For example,
x = tf.constant(3.0)
x2 = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
    g.watch(x)
    with tf.GradientTape(persistent=True) as g2:
        g2.watch(x2)
        y = x * x
        y2 = y * x2
dy_dx = g.gradient(y, x)  # 6
dy2_dx2 = g2.gradient(y2, x2)  # 9
del g, g2  # Drop the reference to the tape
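For comparison, here is a minimal sketch (assuming TensorFlow 2.x with eager execution) in which one tape watches both constants by passing a list to watch(). It also shows that the gradient with respect to a constant that was never watched comes back as None. The names unwatched and missing are only for illustration.

```python
import tensorflow as tf

x = tf.constant(3.0)
x2 = tf.constant(3.0)

# One persistent tape watching both constants at once.
with tf.GradientTape(persistent=True) as g:
    g.watch([x, x2])
    y = x * x
    y2 = y * x2

dy_dx = g.gradient(y, x)       # d(x^2)/dx at x = 3
dy2_dx2 = g.gradient(y2, x2)   # d(y * x2)/dx2 = y
del g  # Drop the reference to the tape

# Without watch(), the tape records nothing for a constant.
with tf.GradientTape() as unwatched:
    y3 = x * x
missing = unwatched.gradient(y3, x)

print(dy_dx.numpy(), dy2_dx2.numpy(), missing)
```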
On the other hand, Variables are automatically watched by GradientTape.
"By default GradientTape will automatically watch any trainable variables that are accessed inside the context." Source: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape
So the example above becomes:
x = tf.Variable(3.0)
x2 = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as g:
    y = x * x
    y2 = y * x2
dy_dx = g.gradient(y, x)  # 6
dy2_dx2 = g.gradient(y2, x2)  # 9
del g  # Drop the reference to the tape
print(dy_dx)
print(dy2_dx2)
Of course, you can turn off the automatic watching by passing watch_accessed_variables=False. The examples may not be very practical, but I hope this clears up someone's confusion.
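As a small illustration of that switch, here is a sketch assuming TensorFlow 2.x with eager execution. With watch_accessed_variables=False a Variable behaves like the constants above: its gradient is None unless you watch it yourself. The names grad_unwatched and grad_watched are only for illustration.

```python
import tensorflow as tf

x = tf.Variable(3.0)

# Automatic watching disabled and x never watched: nothing is recorded.
with tf.GradientTape(watch_accessed_variables=False) as g:
    y = x * x
grad_unwatched = g.gradient(y, x)

# Automatic watching disabled, but x is watched explicitly.
with tf.GradientTape(watch_accessed_variables=False) as g:
    g.watch(x)
    y = x * x
grad_watched = g.gradient(y, x)

print(grad_unwatched, grad_watched.numpy())
```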