TensorFlow Variables and Constants

In TensorFlow, the difference between constants and variables is that once you declare a constant, its value can't be changed later (and it must be initialized with a concrete value, not with an operation).

When you declare a Variable, on the other hand, you can change its value later with the tf.assign() method (and it can be initialized with a value or with an operation).
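A minimal sketch of the contrast in TF 1.x graph mode (the names here are just for illustration):

import tensorflow as tf

c = tf.constant(5)             # fixed once created; there is no assign op for constants
x = tf.Variable(5, name='x')
update = tf.assign(x, x + 1)   # re-assignment is possible for Variables

with tf.Session() as session:
    session.run(tf.global_variables_initializer())
    print(session.run(x))  # 5
    session.run(update)
    print(session.run(x))  # 6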

The function tf.global_variables_initializer() initializes all variables in the graph with the initial values they were declared with, but it runs the individual initializers in no guaranteed order (effectively asynchronously), so it doesn't work reliably when dependencies exist between variables.
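To make the failure mode concrete, assuming your code #2 looked roughly like the following (a reconstruction, not your original code):

x = tf.Variable(35, name='x')
y = tf.Variable(x + 5, name='y')  # y's initializer reads x
model = tf.global_variables_initializer()

with tf.Session() as session:
    session.run(model)     # may fail: y's initializer can run before x is
                           # initialized (FailedPreconditionError)
    print(session.run(y))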

Your first code (#1) works properly because there are no dependencies between variable initializations and the constant is constructed with a value.

The second code (#2) doesn't work because of the unordered behavior of tf.global_variables_initializer(). You can fix it by using tf.variables_initializer() to initialize the variables in sequence, as follows:

import tensorflow as tf

x = tf.Variable(35, name='x')
model_x = tf.variables_initializer([x])

y = tf.Variable(x + 5, name='y')
model_y = tf.variables_initializer([y])

with tf.Session() as session:
    session.run(model_x)   # initialize x first
    session.run(model_y)   # y's initializer can now safely read x
    print(session.run(y))  # 40

The third code (#3) doesn't work because you are trying to initialize a constant with an operation, which isn't possible. To solve it, an appropriate strategy is the one in (#1).
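Assuming code #3 tried something like the following (again a reconstruction), the fix is to build the constant from a value and put the operation inside a Variable:

x = tf.Variable(35, name='x')
y = tf.constant(x + 5)  # TypeError: tf.constant() needs a concrete value, not a tensor

# strategy (#1): constant from a value, operation inside a Variable
x = tf.constant(35, name='x')
y = tf.Variable(x + 5, name='y')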

Regarding your last question: you need (a) session.run(model) whenever there are variables in your computation graph, and (b) session.run(y) to actually evaluate y, e.g. print(session.run(y)).


I will point out the difference when using eager execution.

As of TensorFlow 2.0.0b1, Variables and constants trigger different behaviors when using tf.GradientTape. Strangely, the official documentation is not explicit enough about it.

Let's look at the example code in https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape

import tensorflow as tf

x = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
  g.watch(x)
  y = x * x
  z = y * y
dz_dx = g.gradient(z, x)  # 108.0 (4*x^3 at x = 3)
dy_dx = g.gradient(y, x)  # 6.0
del g  # Drop the reference to the tape

You had to watch x because it is a constant: GradientTape does NOT automatically watch constants in the context. Note that watch() can be called for several tensors on the same tape (or be given a list of tensors); nesting GradientTapes is another way to keep the gradient records of multiple constants separate. For example,

x = tf.constant(3.0)
x2 = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
  g.watch(x)
  with tf.GradientTape(persistent=True) as g2:
    g2.watch(x2)

    y = x * x
    y2 = y * x2

dy_dx = g.gradient(y, x)       # 6
dy2_dx2 = g2.gradient(y2, x2)  # 9
del g, g2  # Drop the reference to the tape
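As noted above, the same can be done with a single tape by watching both constants on it; a minimal sketch:

x = tf.constant(3.0)
x2 = tf.constant(3.0)
with tf.GradientTape(persistent=True) as g:
  g.watch(x)
  g.watch(x2)  # one tape can watch several tensors

  y = x * x
  y2 = y * x2

dy_dx = g.gradient(y, x)       # 6
dy2_dx2 = g.gradient(y2, x2)   # 9
del g  # Drop the reference to the tape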

On the other hand, Variables are automatically watched by GradientTape.

"By default GradientTape will automatically watch any trainable variables that are accessed inside the context." (Source: https://www.tensorflow.org/versions/r2.0/api_docs/python/tf/GradientTape)

So the example above becomes:

x = tf.Variable(3.0)
x2 = tf.Variable(3.0)
with tf.GradientTape(persistent=True) as g:
    y = x * x
    y2 = y * x2

dy_dx = g.gradient(y, x)       # 6
dy2_dx2 = g.gradient(y2, x2)   # 9
del g  # Drop the reference to the tape
print(dy_dx)
print(dy2_dx2)

Of course, you can turn off the automatic watching by passing watch_accessed_variables=False to the GradientTape constructor; in that case you must call watch() explicitly, even for variables. The examples may not be very practical, but I hope this clears up someone's confusion.
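For example, a minimal sketch with automatic watching turned off:

x = tf.Variable(3.0)
with tf.GradientTape(watch_accessed_variables=False) as g:
    g.watch(x)  # now even Variables must be watched explicitly
    y = x * x

print(g.gradient(y, x))  # tf.Tensor(6.0, shape=(), dtype=float32)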