Debugging ML Models

* Clipping gradients (or other tensors) so their norm or values stay within a fixed bound; a sketch follows this list.
** For Tensorflow see [https://www.tensorflow.org/api_docs/python/tf/clip_by_norm tf.clip_by_norm] and [https://www.tensorflow.org/api_docs/python/tf/clip_by_value tf.clip_by_value].
* Using a safe divide which forces the denominator to have absolute value greater than a small epsilon (EPS); see the second sketch below.
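
A minimal sketch of gradient clipping in a TensorFlow 2 training step; the model, optimizer, data, and clip thresholds here are placeholder assumptions for illustration, not part of the original article:

<syntaxhighlight lang="python">
import tensorflow as tf

# Placeholder model, optimizer, and data for illustration only.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
x = tf.random.normal((8, 4))
y = tf.random.normal((8, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))

grads = tape.gradient(loss, model.trainable_variables)
# Rescale each gradient so its L2 norm is at most 1.0 (threshold is an assumption).
grads = [tf.clip_by_norm(g, clip_norm=1.0) for g in grads]
# Alternative: clamp each element into [-1, 1] instead.
# grads = [tf.clip_by_value(g, -1.0, 1.0) for g in grads]
optimizer.apply_gradients(zip(grads, model.trainable_variables))
</syntaxhighlight>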
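And a sketch of the safe-divide idea; the function name, the EPS value, and the sign handling are one assumed implementation. TensorFlow also ships tf.math.divide_no_nan, which instead returns 0 where the denominator is 0:

<syntaxhighlight lang="python">
import tensorflow as tf

EPS = 1e-8  # assumed threshold; pick based on your dtype

def safe_divide(num, denom, eps=EPS):
    # Force |denom| >= eps while preserving its sign, so the
    # quotient never becomes inf/NaN and its sign is unchanged.
    sign = tf.where(denom >= 0.0, tf.ones_like(denom), -tf.ones_like(denom))
    return num / (sign * tf.maximum(tf.abs(denom), eps))
</syntaxhighlight>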
==Soft Operations==
The idea of soft operations is to make sure that gradients flow through the entire network rather than through one specific path.
One example is softmax, which acts as a differentiable relaxation of a hard one-hot (argmax) selection, so gradients reach every input score rather than only the selected one.
* Rather than regressing a real value <math>x</math> directly, regress a probability distribution.
** Output scores for <math>P(x=j)</math> over some fixed set of values <math>j</math>, apply softmax, and take the expected value (sketched below).
** Or output <math>\mu, \sigma</math> and normalize the loss based on <math>\sigma</math> (one way to do this is sketched below).
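
A minimal sketch of the softmax-expectation head; the bin range and count are assumptions for illustration:

<syntaxhighlight lang="python">
import tensorflow as tf

# Assumed fixed support for x: 51 evenly spaced bins on [0, 10].
bins = tf.linspace(0.0, 10.0, 51)

def expected_value_head(logits):
    # logits: (batch, 51) raw scores for P(x = j).
    probs = tf.nn.softmax(logits, axis=-1)       # soft, differentiable selection
    return tf.reduce_sum(probs * bins, axis=-1)  # E[x]; gradients flow to every bin
</syntaxhighlight>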
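And one common way to realize the <math>\mu, \sigma</math> variant, assuming a Gaussian negative log-likelihood; predicting <math>\log \sigma</math> for numerical stability is also an assumed convention:

<syntaxhighlight lang="python">
import tensorflow as tf

def gaussian_nll(mu, log_sigma, target):
    # Negative log-likelihood of target under N(mu, sigma^2), up to a constant.
    # The squared error is divided by sigma^2, which normalizes the loss by
    # the predicted uncertainty; the log_sigma term penalizes large sigma.
    sigma = tf.exp(log_sigma)
    return 0.5 * tf.square((target - mu) / sigma) + log_sigma
</syntaxhighlight>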