Debugging ML Models

* Clipping gradients (or other tensors) so their norm or values stay within a fixed bound; a sketch follows this list.
** For Tensorflow see [https://www.tensorflow.org/api_docs/python/tf/clip_by_norm tf.clip_by_norm] and [https://www.tensorflow.org/api_docs/python/tf/clip_by_value tf.clip_by_value].
* Using a safe divide which forces the denominator to have absolute value greater than a small epsilon (EPS); see the second sketch below.
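
A minimal sketch of gradient clipping in a TensorFlow 2 training step; the model, optimizer, data, and clip thresholds here are placeholder assumptions for illustration, not part of the original article:

<syntaxhighlight lang="python">
import tensorflow as tf

# Placeholder model, optimizer, and data for illustration only.
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.01)
x = tf.random.normal((8, 4))
y = tf.random.normal((8, 1))

with tf.GradientTape() as tape:
    loss = tf.reduce_mean(tf.square(model(x) - y))

grads = tape.gradient(loss, model.trainable_variables)
# Rescale each gradient so its L2 norm is at most 1.0 (threshold is an assumption).
grads = [tf.clip_by_norm(g, clip_norm=1.0) for g in grads]
# Alternative: clamp each element into [-1, 1] instead.
# grads = [tf.clip_by_value(g, -1.0, 1.0) for g in grads]
optimizer.apply_gradients(zip(grads, model.trainable_variables))
</syntaxhighlight>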
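And a sketch of the safe-divide idea; the function name, the EPS value, and the sign handling are one assumed implementation. TensorFlow also ships tf.math.divide_no_nan, which instead returns 0 where the denominator is 0:

<syntaxhighlight lang="python">
import tensorflow as tf

EPS = 1e-8  # assumed threshold; pick based on your dtype

def safe_divide(num, denom, eps=EPS):
    # Force |denom| >= eps while preserving its sign, so the
    # quotient never becomes inf/NaN and its sign is unchanged.
    sign = tf.where(denom >= 0.0, tf.ones_like(denom), -tf.ones_like(denom))
    return num / (sign * tf.maximum(tf.abs(denom), eps))
</syntaxhighlight>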
==Soft Operations==
The idea of soft operations is to make sure that gradients flow through the entire network rather than through one specific path.
One example is softmax, which acts as a differentiable relaxation of a hard one-hot (argmax) selection, so gradients reach every input score rather than only the selected one.
* Rather than regressing a real value <math>x</math> directly, regress a probability distribution.
** Output scores for <math>P(x=j)</math> over some fixed set of values <math>j</math>, apply softmax, and take the expected value (sketched below).
** Or output <math>\mu, \sigma</math> and normalize the loss based on <math>\sigma</math> (one way to do this is sketched below).
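
A minimal sketch of the softmax-expectation head; the bin range and count are assumptions for illustration:

<syntaxhighlight lang="python">
import tensorflow as tf

# Assumed fixed support for x: 51 evenly spaced bins on [0, 10].
bins = tf.linspace(0.0, 10.0, 51)

def expected_value_head(logits):
    # logits: (batch, 51) raw scores for P(x = j).
    probs = tf.nn.softmax(logits, axis=-1)       # soft, differentiable selection
    return tf.reduce_sum(probs * bins, axis=-1)  # E[x]; gradients flow to every bin
</syntaxhighlight>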
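And one common way to realize the <math>\mu, \sigma</math> variant, assuming a Gaussian negative log-likelihood; predicting <math>\log \sigma</math> for numerical stability is also an assumed convention:

<syntaxhighlight lang="python">
import tensorflow as tf

def gaussian_nll(mu, log_sigma, target):
    # Negative log-likelihood of target under N(mu, sigma^2), up to a constant.
    # The squared error is divided by sigma^2, which normalizes the loss by
    # the predicted uncertainty; the log_sigma term penalizes large sigma.
    sigma = tf.exp(log_sigma)
    return 0.5 * tf.square((target - mu) / sigma) + log_sigma
</syntaxhighlight>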