Debugging ML Models
Notes on debugging ML models, primarily CNNs.
Most of this is advice I've found online or gotten through mentors or experience.
==Debugging==
==Overfitting==
Overfitting occurs when your model begins learning attributes specific to your training data, causing your validation loss to increase.
Historically this was a big concern for ML models, and people relied heavily on regularization to address overfitting.
Recently, though, overfitting has become less of a concern with larger ML models.
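As a concrete illustration, here is a minimal Keras sketch (the synthetic data, layer sizes, and hyperparameters are made up for this example, not taken from the notes above): an L2 penalty provides the classic regularization, and an early-stopping callback halts training once validation loss stops improving, i.e. once the model starts overfitting.
<syntaxhighlight lang="python">
import numpy as np
import tensorflow as tf

# Synthetic data purely for illustration.
x_train = np.random.rand(1000, 32).astype("float32")
y_train = np.random.randint(0, 2, 1000)
x_val = np.random.rand(200, 32).astype("float32")
y_val = np.random.randint(0, 2, 200)

model = tf.keras.Sequential([
    # Classic remedy: penalize large weights with L2 regularization.
    tf.keras.layers.Dense(64, activation="relu",
                          kernel_regularizer=tf.keras.regularizers.l2(1e-4)),
    tf.keras.layers.Dense(1, activation="sigmoid"),
])
model.compile(optimizer="adam", loss="binary_crossentropy")

# Stop once validation loss stops improving, the usual sign of overfitting.
early_stop = tf.keras.callbacks.EarlyStopping(
    monitor="val_loss", patience=3, restore_best_weights=True)

model.fit(x_train, y_train, validation_data=(x_val, y_val),
          epochs=50, callbacks=[early_stop])
</syntaxhighlight>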
* Checking that the training data has no NaNs or Infs (see the sketch after this list).
* Checking that there are no divides anywhere in the code or that all divides are safe.
** See [https://www.tensorflow.org/api_docs/python/tf/math/divide_no_nan <code>tf.math.divide_no_nan</code>].
* Checking the gradients of trig functions in the code.
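A minimal sketch of these checks (the example batch and the <code>acos</code> call are made up for illustration; only <code>tf.math.divide_no_nan</code> comes from the list above):
<syntaxhighlight lang="python">
import numpy as np
import tensorflow as tf

# Hypothetical training batch, used only to illustrate the data check.
batch = np.array([[0.5, 1.0], [0.0, 2.0]], dtype=np.float32)
assert np.all(np.isfinite(batch)), "training data contains NaN or Inf"

# Safe divide: divide_no_nan returns 0 wherever the denominator is 0,
# instead of producing Inf/NaN that propagates through the loss.
num = tf.constant([1.0, 2.0])
den = tf.constant([0.0, 4.0])
print(tf.math.divide_no_nan(num, den).numpy())  # [0.  0.5]

# Trig-function gradients: the analytic gradient of acos, -1/sqrt(1 - x^2),
# is infinite at x = +/-1, so boundary inputs poison the backward pass.
x = tf.constant([1.0])
with tf.GradientTape() as tape:
    tape.watch(x)
    y = tf.math.acos(x)
grad = tape.gradient(y, x)
print(tf.math.is_finite(grad).numpy())  # [False]
</syntaxhighlight>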