What does opt.apply_gradients() do in TensorFlow?
The update rule that the apply_gradients method actually applies depends on the specific optimizer. Take a look at the implementation of apply_gradients in the tf.train.Optimizer class here. It relies on the derived classes implementing the update rule in the methods _apply_dense and _apply_sparse. The vanilla update rule you are referring to, v <- v - learning_rate * gradient, is implemented by the GradientDescentOptimizer.
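For illustration, here is a minimal TF 1.x graph-mode sketch (with made-up values) showing that apply_gradients on a GradientDescentOptimizer performs exactly that update:

import tensorflow as tf

# One gradient-descent step on E = v^2 should move v from 2.0 to
# 2.0 - 0.1 * dE/dv = 2.0 - 0.1 * 4.0 = 1.6.
v = tf.Variable(2.0)
E = tf.square(v)
opt = tf.train.GradientDescentOptimizer(learning_rate=0.1)
train_op = opt.apply_gradients(opt.compute_gradients(E, [v]))

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    sess.run(train_op)
    print(sess.run(v))  # prints 1.6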
Regarding your desired positive additive update: if what you are calling opt is an instantiation of GradientDescentOptimizer, then you could indeed achieve what you want as follows:
grads_and_vars = opt.compute_gradients(E, [v])
eta = opt._learning_rate  # note: a private attribute of the optimizer
# p is the additive term from your question. Since the optimizer applies
# v <- v - eta * grad, feeding it grad - (1/eta) * p yields
# v <- v - eta * grad + p, i.e. the usual step plus the additive update.
my_grads_and_vars = [(g - (1/eta) * p, v) for g, v in grads_and_vars]
opt.apply_gradients(my_grads_and_vars)
The more elegant way to do this is probably to write a new optimizer (inheriting from tf.train.Optimizer) that implements your desired update rule directly, as sketched below.
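A minimal sketch of such a subclass, assuming the same additive update v <- v - lr * grad + p as above (the class name and the p parameter are mine, not part of TensorFlow):

import tensorflow as tf

class AdditiveUpdateOptimizer(tf.train.Optimizer):
    """Hypothetical optimizer applying v <- v - lr * grad + p directly."""

    def __init__(self, learning_rate, p, use_locking=False, name="AdditiveUpdate"):
        super(AdditiveUpdateOptimizer, self).__init__(use_locking, name)
        self._lr = learning_rate
        self._p = p

    def _apply_dense(self, grad, var):
        lr = tf.cast(self._lr, var.dtype.base_dtype)
        p = tf.cast(self._p, var.dtype.base_dtype)
        # The whole update rule lives here instead of in modified gradients.
        return tf.assign_sub(var, lr * grad - p, use_locking=self._use_locking)

    def _apply_sparse(self, grad, var):
        raise NotImplementedError("Sparse updates not needed for this sketch.")

You would then use it like any other optimizer, e.g. AdditiveUpdateOptimizer(0.1, p=0.01).minimize(E).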
You can also use the eager execution API:
import tensorflow as tf

tf.enable_eager_execution()
tfe = tf.contrib.eager

# learning_rate, loss, model_fn and val_list are placeholders for your own
# step size, loss function, model function and inputs.
optimizer = tf.train.GradientDescentOptimizer(learning_rate)
grad = tfe.implicit_gradients(loss)  # returns a function with the same signature as loss
optimizer.apply_gradients(grad(model_fn, val_list))
Here is a concrete instance of this pattern:
import numpy as np
import tensorflow as tf

tf.enable_eager_execution()
tfe = tf.contrib.eager

# Toy data standing in for your own inputs and labels
train_X = np.asarray([1.0, 2.0, 3.0, 4.0], dtype=np.float32)
train_Y = np.asarray([2.0, 4.0, 6.0, 8.0], dtype=np.float32)
n_samples = train_X.shape[0]

W = tfe.Variable(np.float32(np.random.randn()))  # float32 to match the data
b = tfe.Variable(np.float32(np.random.randn()))

def linear_regression(inputs):
    return inputs * W + b

def MSE(model_fn, inputs, labels):
    return tf.reduce_sum(tf.pow(model_fn(inputs) - labels, 2)) / (2 * n_samples)

optimizer = tf.train.GradientDescentOptimizer(learning_rate=0.001)
grad = tfe.implicit_gradients(MSE)
optimizer.apply_gradients(grad(linear_regression, train_X, train_Y))
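A single apply_gradients call performs one update step; in practice you would repeat it, for example (the step count here is arbitrary):

for step in range(1000):
    optimizer.apply_gradients(grad(linear_regression, train_X, train_Y))

print("W = {}, b = {}".format(W.numpy(), b.numpy()))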