Difference between `apply_gradients` and `minimize` of optimizer in tensorflow
Here it says that minimize uses tf.GradientTape and then apply_gradients:
Minimize loss by updating var_list.
This method simply computes gradient using tf.GradientTape and calls apply_gradients(). If you want to process the gradient before applying then call tf.GradientTape and apply_gradients() explicitly instead of using this function.
So minimize actually uses apply_gradients, just like:
def minimize(self, loss, var_list, grad_loss=None, name=None, tape=None):
    grads_and_vars = self._compute_gradients(loss, var_list=var_list, grad_loss=grad_loss, tape=tape)
    return self.apply_gradients(grads_and_vars, name=name)
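To see the equivalence concretely, here is a minimal sketch (the variable `w` and the quadratic loss are illustrative, not from the original post) that performs one update step via minimize and the same step manually via tf.GradientTape plus apply_gradients:

```python
import tensorflow as tf

# Path 1: one step via minimize (loss passed as a callable).
w = tf.Variable(3.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)
opt.minimize(lambda: (w - 1.0) ** 2, var_list=[w])

# Path 2: the equivalent manual route — compute gradients with
# tf.GradientTape, then hand them to apply_gradients.
w2 = tf.Variable(3.0)
opt2 = tf.keras.optimizers.SGD(learning_rate=0.1)
with tf.GradientTape() as tape:
    loss = (w2 - 1.0) ** 2
grads = tape.gradient(loss, [w2])
opt2.apply_gradients(zip(grads, [w2]))

# Both variables receive the same SGD update: 3.0 - 0.1 * 2*(3.0-1.0) = 2.6
print(w.numpy(), w2.numpy())
```

Both paths end with the variable at the same value, which is exactly what the `minimize` source above implies.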
In your example, you use compute_gradients and apply_gradients. This is indeed valid, but nowadays compute_gradients has been made private, so it is not good practice to use it. For this reason the function is no longer in the documentation.
You can easily see from the link https://www.tensorflow.org/get_started/get_started (tf.train API part) that they actually do the same job. The difference is that if you use the separated functions (tf.gradients, tf.apply_gradients), you can apply other mechanisms between them, such as gradient clipping.
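As a sketch of that "other mechanism between them" point, here is gradient clipping inserted between the gradient computation and apply_gradients; the single variable and quadratic loss are illustrative assumptions, not from the original question:

```python
import tensorflow as tf

w = tf.Variable(10.0)
opt = tf.keras.optimizers.SGD(learning_rate=0.1)

with tf.GradientTape() as tape:
    loss = (w - 1.0) ** 2  # gradient here is 2*(w - 1) = 18.0

grads = tape.gradient(loss, [w])

# Clip the global norm to 1.0 before applying — this is the step
# you could not inject if you called minimize directly.
clipped, _ = tf.clip_by_global_norm(grads, clip_norm=1.0)
opt.apply_gradients(zip(clipped, [w]))

# Raw gradient norm is 18.0 > 1.0, so the clipped gradient is 1.0,
# and the update is 10.0 - 0.1 * 1.0 = 9.9.
print(w.numpy())
```

Without clipping, the same step would have moved the variable by 1.8 instead of 0.1, which is why this pattern is common for stabilizing training.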