Debugging ML Models
** For TensorFlow see [https://www.tensorflow.org/api_docs/python/tf/clip_by_norm tf.clip_by_norm] and [https://www.tensorflow.org/api_docs/python/tf/clip_by_value tf.clip_by_value].
* Using a safe divide which forces the denominator to have absolute value of at least EPS, as sketched below.
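A minimal sketch of such a safe divide in TensorFlow (the helper name and the EPS default are illustrative, not from this page):
<syntaxhighlight lang="python">
import tensorflow as tf

def safe_divide(numerator, denominator, eps=1e-8):
    """Elementwise divide that forces abs(denominator) >= eps,
    preventing inf/NaN outputs and exploding gradients near zero."""
    # Treat an exactly-zero denominator as positive so the clamp is never 0.
    sign = tf.where(denominator >= 0.0,
                    tf.ones_like(denominator),
                    -tf.ones_like(denominator))
    safe_denom = tf.where(tf.abs(denominator) < eps, sign * eps, denominator)
    return numerator / safe_denom
</syntaxhighlight>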
==Soft Operations==
The idea of soft operations is to make sure that gradients flow through the entire network rather than through one specific path.
One example is softmax, which acts as a differentiable relaxation of a hard one-hot (argmax) selection, so every input logit receives a gradient rather than only the selected one.
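A toy sketch of this in TensorFlow (the logits and candidate values are made up for illustration): replacing a hard argmax lookup with a softmax-weighted sum lets gradients reach all of the logits.
<syntaxhighlight lang="python">
import tensorflow as tf

logits = tf.Variable([2.0, 1.0, 0.1])
values = tf.constant([10.0, 20.0, 30.0])  # candidates to select among

with tf.GradientTape() as tape:
    # Hard selection, values[tf.argmax(logits)], has no gradient w.r.t. logits.
    # Soft selection weights every candidate, so gradients flow everywhere.
    weights = tf.nn.softmax(logits)
    soft_value = tf.reduce_sum(weights * values)

grads = tape.gradient(soft_value, logits)
print(soft_value.numpy(), grads.numpy())  # every logit receives a nonzero gradient
</syntaxhighlight>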
* Rather than regressing a real value <math>x</math> directly, regress a probability distribution (both variants are sketched after this list).
** Output scores for <math>P(x=j)</math> over some fixed set of values <math>j</math>, apply softmax, and take the expected value <math>\textstyle\sum_j j \, P(x=j)</math>.
** Or output <math>\mu, \sigma</math> and normalize the loss based on <math>\sigma</math>, as in a Gaussian negative log-likelihood.
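One possible reading of these two bullets in TensorFlow (the function names, the fixed support, and the log-<math>\sigma</math> parameterization are assumptions, not from this page):
<syntaxhighlight lang="python">
import tensorflow as tf

def expected_value_head(scores, support):
    """Regress x as the mean of a categorical distribution over `support`."""
    probs = tf.nn.softmax(scores, axis=-1)          # P(x = j) for each fixed j
    return tf.reduce_sum(probs * support, axis=-1)  # E[x] = sum_j j * P(x=j)

def gaussian_nll(x, mu, log_sigma):
    """Gaussian negative log-likelihood (up to a constant); sigma rescales the error."""
    sigma = tf.exp(log_sigma)  # predict log(sigma) so sigma stays positive
    return 0.5 * tf.square((x - mu) / sigma) + log_sigma

# Example: scores over the fixed support {0, 1, 2, 3}.
scores = tf.constant([[0.5, 2.0, 1.0, -1.0]])
support = tf.constant([0.0, 1.0, 2.0, 3.0])
print(expected_value_head(scores, support).numpy())
</syntaxhighlight>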