How Can I Define Only the Gradient for a TensorFlow Subgraph?
Here's a trick from Sergey Ioffe:
Suppose you want a group of ops that behaves as f(x) in forward mode but as g(x) in backward mode. You implement it as:
t = g(x)
y = t + tf.stop_gradient(f(x) - t)
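To try the two lines above, here is a minimal sketch. It assumes TensorFlow 2.x in eager mode with tf.GradientTape, and it picks f = tf.round and g = identity purely for illustration; neither choice comes from the original trick.

```python
import tensorflow as tf

# Hypothetical choices for illustration only:
# forward behaviour f(x) = round(x), backward behaviour g(x) = x.
def f(x):
    return tf.round(x)

def g(x):
    return x

x = tf.constant([0.2, 1.7, -2.4])

with tf.GradientTape() as tape:
    tape.watch(x)  # x is a constant, so it has to be watched explicitly
    t = g(x)
    # Value: t + (f(x) - t), i.e. f(x).
    # Gradient: only t contributes, because stop_gradient blocks the rest.
    y = t + tf.stop_gradient(f(x) - t)

grad = tape.gradient(y, x)

print(y.numpy())     # forward values come from f
print(grad.numpy())  # gradients come from g
```

With these choices y carries the rounded values, while the gradient with respect to x is all ones, which is what differentiating g alone would give.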
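In the forward pass the t terms cancel, so y equals f(x); in the backward pass tf.stop_gradient blocks the second term, so the gradient is that of g(x).
So in your case, g(x) could be an identity op with a custom gradient, set using gradient_override_map.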
From TensorFlow 1.7 onward, tf.custom_gradient is the way to go.
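As a rough sketch of the tf.custom_gradient route, the following defines an op that is the identity in the forward pass but supplies its own backward rule. It assumes TensorFlow 2.x eager mode; the function name identity_with_custom_grad and the 0.5 scaling factor are hypothetical, chosen only so the effect is visible.

```python
import tensorflow as tf

@tf.custom_gradient
def identity_with_custom_grad(x):
    # Forward pass: plain identity.
    y = tf.identity(x)

    def grad(upstream):
        # Backward pass: hypothetical rule that halves the incoming gradient.
        return 0.5 * upstream

    return y, grad

x = tf.constant([1.0, 2.0, 3.0])

with tf.GradientTape() as tape:
    tape.watch(x)
    out = tf.reduce_sum(3.0 * identity_with_custom_grad(x))

dx = tape.gradient(out, x)

print(out.numpy())  # forward value is unaffected by the custom gradient
print(dx.numpy())   # 3.0 from the multiply, times 0.5 from the custom rule
```

The decorated function returns both the forward result and a function that maps the upstream gradient to the gradient with respect to the inputs, so no stop_gradient arithmetic or gradient registry is needed.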