Neural Network Compression

From David's Wiki
Revision as of 21:04, 2 February 2021

A brief survey of neural network compression techniques.

Pruning

Sensitivity Methods

The idea here is to measure how sensitive the network is to each neuron.
That is, if you remove the neuron, how much will the output change?

  • Mozer and Smolensky (1988)[1] attach a gate to each neuron. The sensitivity can then be estimated from the derivative with respect to the gate.
  • Karnin (1990)[2] estimates the sensitivity by monitoring how each weight changes during training.
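
The gate idea from the first bullet can be sketched on a toy network. This is a minimal illustration and not the authors' implementation: the network, its weights (`W1`, `W2`), the data, and all function names are hypothetical, and a central finite difference stands in for the backpropagated gradient. Each hidden unit's output is multiplied by a gate that equals 1 in the intact network; the negative derivative of the error with respect to the gate is then a first-order estimate of how much the error would rise if the unit were removed (gate set to 0).

```python
import math

# Hypothetical toy network: 2 inputs -> 3 hidden units (tanh) -> 1 linear output.
W1 = [[0.5, -0.3], [0.8, 0.2], [0.01, 0.02]]   # hidden-layer weights (illustrative)
W2 = [1.0, -0.7, 0.05]                          # output weights (illustrative)
DATA = [((1.0, 0.0), 0.4), ((0.0, 1.0), -0.2), ((1.0, 1.0), 0.1)]


def loss(gates):
    """Mean squared error with each hidden unit's output scaled by its gate."""
    total = 0.0
    for x, y in DATA:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
        out = sum(w * g * h for w, g, h in zip(W2, gates, hidden))
        total += (out - y) ** 2
    return total / len(DATA)


def gate_derivative(i, eps=1e-5):
    """Central-difference estimate of dE/d(gate_i) with all gates open (= 1)."""
    up = [1.0] * len(W2)
    down = [1.0] * len(W2)
    up[i] += eps
    down[i] -= eps
    return (loss(up) - loss(down)) / (2 * eps)


def removal_effect(i):
    """Exact change in error when hidden unit i is removed (gate set to 0)."""
    gates = [1.0] * len(W2)
    gates[i] = 0.0
    return loss(gates) - loss([1.0] * len(W2))


# First-order relevance estimate: E(gate=0) - E(gate=1) is roughly -dE/d(gate).
relevance = [-gate_derivative(i) for i in range(len(W2))]

# The unit whose estimated relevance is closest to zero is the pruning candidate.
least_relevant = min(range(len(W2)), key=lambda i: abs(relevance[i]))

for i, r in enumerate(relevance):
    print(f"unit {i}: estimated relevance {r:+.5f}, exact removal effect {removal_effect(i):+.5f}")
```

Comparing the two printed columns shows where the linear estimate is trustworthy: it tracks the exact removal effect closely for units with small contributions and drifts for units with large ones, since the derivative is only a local approximation.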

Factorization

Libraries

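Both TensorFlow and PyTorch have built-in libraries for pruning:

  • PyTorch pruning tutorial: https://pytorch.org/tutorials/intermediate/pruning_tutorial.html
  • TF: Pruning with Keras: https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras#overview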
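
As a library-free illustration of the kind of operation such pruning utilities perform, the sketch below zeroes the smallest-magnitude fraction of a weight list and returns the binary mask alongside it. The function name `magnitude_prune` and the example weights are hypothetical. In PyTorch, the comparable built-in is `torch.nn.utils.prune.l1_unstructured(module, name, amount)`, which likewise keeps a mask next to the original weights; this sketch only mirrors the idea, not that API.

```python
def magnitude_prune(weights, amount):
    """Zero out the `amount` fraction of entries with the smallest |w|.

    Returns (pruned_weights, mask), where mask[i] is 0 for pruned entries.
    """
    n_prune = round(amount * len(weights))
    # Indices ordered by absolute value, smallest first.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    mask = [1] * len(weights)
    for i in order[:n_prune]:
        mask[i] = 0
    pruned = [w if m else 0.0 for w, m in zip(weights, mask)]
    return pruned, mask


weights = [0.8, -0.05, 0.3, 0.01, -0.6, 0.12]   # illustrative values
pruned, mask = magnitude_prune(weights, amount=0.5)
print(pruned)
print(mask)
```

Keeping the mask separate from the weights makes it possible to reapply the same sparsity pattern after further training steps, which is one reason pruning utilities tend to store it explicitly.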

Resources

Surveys

References


  1. Mozer, M. C., & Smolensky, P. (1988). Skeletonization: A technique for trimming the fat from a network via relevance assessment. (NeurIPS 1988).
  2. Karnin, E. D. (1990). A simple procedure for pruning back-propagation trained neural networks. (IEEE TNNLS 1990).