What's the difference between optimizer.compute_gradient() and tf.gradients() in tensorflow?
optimizer.compute_gradients() wraps tf.gradients(), as you can see here. It does additional asserts (which explains your error).
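As a quick illustration of that wrapping, here is a minimal sketch, assuming TF 1.x graph mode (tf.compat.v1 under TF 2); the variable x and the quadratic loss are made up for the example:

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

x = tf.Variable(3.0, name="x")      # hypothetical variable
loss = tf.square(x)                 # hypothetical loss
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# compute_gradients() validates its inputs, calls tf.gradients()
# internally, and pairs each gradient with its variable.
pairs = optimizer.compute_gradients(loss)   # [(dloss/dx, x)]
grads = tf.gradients(loss, [x])             # [dloss/dx], no pairing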
I would like to add to the above answer by referring to a simple point. optimizer.compute_gradients() returns a list of (gradient, variable) tuples. The variables are always present, but the gradients can be None: the gradient of the given loss with respect to some of the variables in var_list may not exist, and a None entry simply says the loss has no dependency on that variable.
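For example, a minimal sketch of the None case (same TF 1.x assumptions as above; the variables a and b are hypothetical):

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

a = tf.Variable(2.0, name="a")
b = tf.Variable(5.0, name="b")      # the loss below never touches b
loss = tf.square(a)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)
grads_and_vars = optimizer.compute_gradients(loss, var_list=[a, b])
# -> [(<Tensor dloss/da>, a), (None, b)]
# b's gradient is None because the loss has no dependency on b.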
On the other hand, tf.gradients() only returns a list of sum(dy/dx) tensors, one for each variable. It MUST be zipped back together with the variable list to apply the gradient update.
Hence, the following two approaches can be used interchangeably:
### Approach 1 ###
variable_list = desired_list_of_variables
gradients = optimizer.compute_gradients(loss, var_list=variable_list)
optimizer.apply_gradients(gradients)

### Approach 2 ###
variable_list = desired_list_of_variables
# tf.gradients takes the variables as its second positional argument (xs);
# it has no var_list keyword.
gradients = tf.gradients(loss, variable_list)
optimizer.apply_gradients(zip(gradients, variable_list))
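And a runnable end-to-end sketch of both approaches, under the same TF 1.x assumptions (the variable w and the loss are invented for the example):

import tensorflow.compat.v1 as tf
tf.disable_v2_behavior()

w = tf.Variable(4.0, name="w")
loss = tf.square(w - 1.0)
optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.1)

# Approach 1: (gradient, variable) pairs come out already matched.
train_op_1 = optimizer.apply_gradients(
    optimizer.compute_gradients(loss, var_list=[w]))

# Approach 2: plain gradients must be re-paired with the variables.
grads = tf.gradients(loss, [w])
train_op_2 = optimizer.apply_gradients(zip(grads, [w]))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op_1)
    sess.run(train_op_2)
    print(sess.run(w))  # w moves toward 1.0 under either update op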