Debugging ML Models

* Train for 4x as long, until both the training loss and validation loss flatten.
* Increase or decrease the learning rate by one order of magnitude.
* Make sure the batch size is a power of 2. Try increasing it for more stable gradient updates, or decreasing it for faster iterations with noisier gradients.
* Try disabling any regularization tricks you are using, such as dropout (see the sketch after this list).
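
A minimal sketch of how these knobs might be exposed in a training loop, using PyTorch. The model, toy data, and default values here are illustrative assumptions, not part of the original article; the point is simply that learning rate, batch size, epoch count, and dropout should all be easy to vary from one place.

<syntaxhighlight lang="python">
# Illustrative sketch: a tiny training loop with the debugging knobs
# (epochs, learning rate, batch size, dropout) exposed as arguments.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

def train(lr=1e-3, batch_size=64, epochs=10, use_dropout=True):
    # Toy random data stands in for a real dataset.
    X = torch.randn(1024, 20)
    y = (X.sum(dim=1) > 0).long()
    loader = DataLoader(TensorDataset(X, y), batch_size=batch_size, shuffle=True)

    # Dropout becomes a no-op when p=0, so it can be toggled off cleanly.
    model = nn.Sequential(
        nn.Linear(20, 64),
        nn.ReLU(),
        nn.Dropout(p=0.5 if use_dropout else 0.0),
        nn.Linear(64, 2),
    )
    optimizer = torch.optim.Adam(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()

    for epoch in range(epochs):
        for xb, yb in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(xb), yb)
            loss.backward()
            optimizer.step()
        print(f"epoch {epoch}: loss {loss.item():.4f}")

# Experiments corresponding to the checklist above:
# train(epochs=40)                              # train ~4x longer
# train(lr=1e-2); train(lr=1e-4)                # learning rate +/- one order of magnitude
# train(batch_size=128); train(batch_size=32)   # power-of-two batch sizes
# train(use_dropout=False)                      # disable dropout
</syntaxhighlight>

Keeping each experiment a one-line call like this makes it easier to change one variable at a time and compare the resulting loss curves.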