Debugging ML Models

From David's Wiki

Revision as of 17:57, 3 August 2020

Notes on debugging ML models, primarily CNNs.

Debugging

  • Train on a single example and see if it overfits.
    • If it doesn't overfit, there may be an issue with your code.
    • You can try increasing the capacity (e.g. the number of filters or the number of nodes in the fully connected (FC) layers) by 2-4x.
      For example, if the input has 3 channels, the first conv layer should have more than 3 output channels.
    • Check that your loss is implemented correctly and computed against the correct ground-truth image.
  • Dump all inputs and outputs into TensorBoard. You may have an unexpected input or output somewhere.
  • If the model appears to be underfitting (e.g. if the training output and validation output are both blurry):
    • Train longer (e.g. 4x as long), until the training loss and validation loss both flatten.
    • Increase or decrease the learning rate by one order of magnitude.
    • Make sure the batch size is a multiple of 2. Try increasing it to get more stable gradient updates or decreasing it to get faster iterations.
  • Try disabling any tricks you are using, such as dropout.
  • Make sure there is no activation on the final layer.
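
The single-example overfitting check and the learning-rate sweep from the list above can be sketched without any framework. This is only a minimal illustration, not code from these notes: the helper name overfit_single_example, the toy linear model, the example input and the learning rates are all assumptions chosen for the demo. With a real CNN the same idea applies: feed one fixed (input, target) pair repeatedly and confirm the training loss drops to roughly zero.

```python
def overfit_single_example(x, y, lr=0.05, steps=500):
    """Fit a toy linear model y_hat = w . x + b to ONE example.

    Hypothetical helper for illustration. Returns the squared-error loss
    recorded at every step, measured before that step's weight update.
    """
    w = [0.0] * len(x)
    b = 0.0
    losses = []
    for _ in range(steps):
        # Forward pass: raw linear output, no activation on the final layer.
        y_hat = sum(wi * xi for wi, xi in zip(w, x)) + b
        err = y_hat - y
        losses.append(err * err)
        # Gradient of the squared error: dL/dw = 2*err*x, dL/db = 2*err.
        w = [wi - lr * 2.0 * err * xi for wi, xi in zip(w, x)]
        b -= lr * 2.0 * err
    return losses


# Sanity check: a working training loop should drive the loss on a
# single example to (almost) zero.
x_single, y_single = [0.5, -1.0, 2.0], 3.0
losses = overfit_single_example(x_single, y_single)
print("first loss:", losses[0], "final loss:", losses[-1])

# Learning-rate sweep, one order of magnitude in each direction.
for lr in (0.005, 0.05, 0.5):
    final = overfit_single_example(x_single, y_single, lr=lr, steps=20)[-1]
    print("lr =", lr, "loss after 20 steps:", final)
```

If the final loss does not approach zero on a single example, the bug is more likely in the model, the loss, or the data pipeline than in the hyperparameters. In this toy setup the largest learning rate diverges and the smallest converges slowly, which is the behavior the order-of-magnitude sweep is meant to expose.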