* Make sure there is no activation on the final layer.
* If the loss is unstable or increasing, drop the learning rate to <code>O(1e-3)</code> or <code>O(1e-4)</code>. (Both points are shown in the first sketch after this list.)
* Try taking the loss closer to the output of the network (see the second sketch after this list).
** If you apply some transformation \(f\) after the output, compute \(loss = loss\_fn(f^{-1}(gt), output)\) instead of \(loss = loss\_fn(gt, f(output))\).
** This shortens the paths the gradients need to flow through.
** Note that this may change the per-pixel weights of the loss function.
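A minimal sketch of the first two points, assuming PyTorch (the page does not name a framework); the model, layer sizes, optimizer, and data below are placeholders:

<syntaxhighlight lang="python">
import torch
import torch.nn as nn

# Regression head: the last layer is a bare Linear, with no activation after it.
model = nn.Sequential(
    nn.Linear(64, 128),
    nn.ReLU(),
    nn.Linear(128, 1),  # final layer: no sigmoid/tanh/ReLU here
)

# If the loss is unstable or increasing, drop the learning rate to 1e-3 or 1e-4.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

loss_fn = nn.MSELoss()
x, gt = torch.randn(32, 64), torch.randn(32, 1)  # dummy batch

output = model(x)
loss = loss_fn(output, gt)
optimizer.zero_grad()
loss.backward()
optimizer.step()
</syntaxhighlight>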
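And a sketch of the inverse-transform trick, again assuming PyTorch; here \(f = \exp\) is only an illustrative choice (e.g. a network that predicts in log space), so substitute your own transform and its inverse:

<syntaxhighlight lang="python">
import torch

def f(x):
    return torch.exp(x)  # transform applied after the network output (illustrative)

def f_inv(y):
    return torch.log(y)  # its inverse, applied to the ground truth instead

loss_fn = torch.nn.MSELoss()
output = torch.randn(4, 1, 8, 8, requires_grad=True)  # stands in for the raw network output
gt = torch.rand(4, 1, 8, 8) + 0.1                     # positive ground truth, so log() is safe

# Long gradient path: gradients must flow back through f.
loss_long = loss_fn(f(output), gt)

# Short gradient path: f is moved onto the ground truth, off the computation
# graph, so gradients flow straight from the loss into the output.
loss_short = loss_fn(output, f_inv(gt))
loss_short.backward()
</syntaxhighlight>

With \(f = \exp\) this also illustrates the last bullet: since \(\log(a) - \log(b) = \log(a/b)\), MSE in log space penalizes relative rather than absolute errors, i.e. the effective per-pixel weighting changes.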