Difference between `apply_gradients` and `minimize` of optimizer in tensorflow

The documentation says that `minimize` uses `tf.GradientTape` and then `apply_gradients`:

Minimize loss by updating var_list.

This method simply computes gradient using tf.GradientTape and calls apply_gradients(). If you want to process the gradient before applying then call tf.GradientTape and apply_gradients() explicitly instead of using this function.

So `minimize` really does just call `apply_gradients`, roughly like this:

def minimize(self, loss, var_list, grad_loss=None, name=None, tape=None):
    grads_and_vars = self._compute_gradients(loss, var_list=var_list, grad_loss=grad_loss, tape=tape)
    return self.apply_gradients(grads_and_vars, name=name)
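A minimal sketch of the same decomposition written out by hand (the variable and loss here are made up for illustration): computing the gradient with `tf.GradientTape` and then handing it to `apply_gradients` is exactly what `minimize` does internally.

```python
import tensorflow as tf

# Toy setup: one trainable variable and a quadratic loss.
var = tf.Variable(2.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

# The explicit version of what minimize() wraps:
with tf.GradientTape() as tape:
    loss = (var - 1.0) ** 2            # loss is minimal at var == 1.0
grads = tape.gradient(loss, [var])     # d(loss)/d(var) = 2 * (var - 1) = 2.0
opt.apply_gradients(zip(grads, [var])) # var <- 2.0 - 0.1 * 2.0 = 1.8
```

After one step `var` holds 1.8, the same value a single `minimize` call would produce with this optimizer and loss.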

In your example you use `compute_gradients` and `apply_gradients`. That is indeed valid, but `compute_gradients` has since been made private, so using it is no longer good practice; for this reason the function no longer appears in the documentation.


You can see from the link https://www.tensorflow.org/get_started/get_started (the tf.train API part) that they do the same job. The difference is that if you use the separate functions (`tf.gradients`, `apply_gradients`), you can apply other mechanisms between them, such as gradient clipping.
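As a sketch of that idea in current TF2 style (names and values are invented for the example): compute the gradients yourself, transform them, then apply them. `minimize` offers no hook for the middle step.

```python
import tensorflow as tf

var = tf.Variable(5.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    loss = (var - 1.0) ** 2            # gradient here is 2 * (5 - 1) = 8.0
grads = tape.gradient(loss, [var])

# The extra step minimize() cannot do: clip gradients before applying them.
clipped = [tf.clip_by_norm(g, 1.0) for g in grads]  # norm 8.0 -> clipped to 1.0
opt.apply_gradients(zip(clipped, [var]))            # var <- 5.0 - 0.1 * 1.0 = 4.9
```

Without the clipping step the update would have been `5.0 - 0.1 * 8.0 = 4.2`; with it, `var` moves only to 4.9.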

Tags:

Tensorflow