Neural Network Compression
==Pruning==
===Sensitivity Methods===
The idea here is to measure how sensitive each weight (i.e., connection) or neuron is.
I.e., if you remove the neuron, how will it change the output?
Typically, weights are pruned by zeroing them out and freezing them.
In general, the procedure is:
# Train the network with a lot of parameters.
# Compute the sensitivity of each parameter.
# Delete low-saliency parameters.
# Continue training, and repeat pruning until the number of parameters is low enough or the error is too high.
Sometimes, pruning can also increase accuracy and improve generalization.
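The prune-retrain loop above can be sketched in a few lines of NumPy. This is a toy illustration, not any paper's method: the "network" is a single linear layer fit by gradient descent, saliency is approximated by weight magnitude (a common, simple proxy), and the problem sizes and schedule are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
true_w = np.zeros(10)
true_w[:3] = [2.0, -1.5, 1.0]        # only 3 of 10 inputs actually matter
y = X @ true_w

w = rng.normal(size=10)
mask = np.ones(10)                   # 1 = active, 0 = pruned (zeroed and frozen)

def train(w, mask, steps=200, lr=0.01):
    """Gradient descent on least squares; frozen weights receive no updates."""
    for _ in range(steps):
        grad = X.T @ (X @ (w * mask) - y) / len(X)
        w = w - lr * grad * mask
    return w

for _ in range(4):                   # repeated prune-retrain rounds
    w = train(w, mask)
    if mask.sum() <= 3:              # stop once few enough parameters remain
        break
    # delete the lowest-saliency (here: smallest-magnitude) active parameter
    active = np.flatnonzero(mask)
    saliency = np.abs(w)
    mask[active[np.argmin(saliency[active])]] = 0.0
```

Real frameworks implement the same idea with per-tensor masks so the forward pass and optimizer both respect the frozen zeros.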
* Mozer and Smolensky (1988)<ref name="mozer1988skeletonization"></ref> use a gate for each neuron. The sensitivity can then be estimated with the derivative w.r.t. the gate.
* Karnin<ref name="karnin1990simple"></ref> estimates the sensitivity by monitoring the change in each weight during training.
* LeCun ''et al.'' present ''Optimal Brain Damage''<ref name="lecun1989optimal"></ref>, which uses the second derivatives of the loss (the diagonal of the Hessian) to estimate each parameter's saliency.
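Optimal Brain Damage scores each weight with a second-order Taylor expansion of the loss, giving the saliency <math>s_k = \tfrac{1}{2} h_{kk} w_k^2</math> where <math>h_{kk}</math> is a diagonal Hessian entry. A minimal sketch, using a least-squares loss so the Hessian is exact rather than approximated (all sizes and values below are illustrative):

```python
import numpy as np

# OBD saliency: s_k = 0.5 * H_kk * w_k^2.
# For L(w) = ||Xw - y||^2 / (2n), the Hessian is exactly X^T X / n,
# so its diagonal needs no approximation in this toy setting.

rng = np.random.default_rng(1)
X = rng.normal(size=(500, 4))
w = np.array([1.5, 0.02, -0.8, 0.01])   # pretend this is the trained optimum
y = X @ w

h_diag = np.einsum("ni,ni->i", X, X) / len(X)   # diag(X^T X) / n
saliency = 0.5 * h_diag * w**2

order = np.argsort(saliency)             # prune lowest-saliency weights first
```

Here the two near-zero weights get the smallest saliencies, so they are the first candidates for deletion.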
===Redundancy Methods===
* Srinivas and Babu<ref name="srinivas2015data"></ref> propose a pair-wise similarity on each neuron: <math>s = \Vert a_j^2 \Vert_1 \Vert W_i - W_j \Vert^2_{2}</math> where <math>a_j</math> is the vector of weights on neuron <math>j</math> at the layer above and <math>W</math> are the neuron weights. This combines a weight metric and a similarity metric into one sensitivity metric. When a neuron is pruned, the weight matrices for the current and next layers need to be updated.
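The pair-wise saliency above can be sketched as follows. This is an illustrative NumPy toy, not the paper's implementation: array shapes are made up, and the next-layer update shown is the natural compensation step (fold the pruned neuron's outgoing weights into its near-duplicate's).

```python
import numpy as np

# Pair-wise redundancy saliency: s_ij = ||a_j^2||_1 * ||W_i - W_j||_2^2,
# where W rows are incoming weight vectors and A columns are outgoing weights.

rng = np.random.default_rng(2)
W = rng.normal(size=(5, 8))          # this layer: 5 neurons, 8 incoming weights
A = rng.normal(size=(3, 5))          # next layer: 3 neurons, 5 inputs
W[4] = W[1] + 0.01                   # make neuron 4 a near-duplicate of neuron 1

def saliency(i, j):
    return np.sum(A[:, j] ** 2) * np.sum((W[i] - W[j]) ** 2)

pairs = [(i, j) for i in range(5) for j in range(5) if i != j]
i, j = min(pairs, key=lambda p: saliency(*p))

# Prune neuron j, compensating in the next layer so outputs barely change:
A[:, i] += A[:, j]
W = np.delete(W, j, axis=0)
A = np.delete(A, j, axis=1)
```

The near-duplicate pair (1, 4) gets the smallest saliency, so one of those two neurons is removed and its outgoing weights are absorbed by the other.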
==Quantization==
* [https://pytorch.org/tutorials/intermediate/pruning_tutorial.html PyTorch pruning tutorial]
* [https://www.tensorflow.org/model_optimization/guide/pruning/pruning_with_keras#overview TF: Pruning with Keras]
These support magnitude-based pruning, which zeroes out small weights.
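At its core, magnitude-based pruning is a threshold operation; the linked libraries wrap it with masks and training hooks. A minimal NumPy illustration of the operation itself (shapes and the sparsity target are arbitrary):

```python
import numpy as np

# Magnitude-based pruning in miniature: zero out the weights whose absolute
# value falls below a threshold chosen to hit a target sparsity.

rng = np.random.default_rng(3)
weights = rng.normal(size=(4, 4))
sparsity = 0.5                                   # fraction of weights to remove

threshold = np.quantile(np.abs(weights), sparsity)
mask = (np.abs(weights) >= threshold).astype(weights.dtype)
pruned = weights * mask
```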
==Resources==
<ref name="mozer1988skeletonization">Mozer, M. C., & Smolensky, P. (1988). Skeletonization: A technique for trimming the fat from a network via relevance assessment. (NeurIPS 1988). [https://proceedings.neurips.cc/paper/1988/file/07e1cd7dca89a1678042477183b7ac3f-Paper.pdf PDF]</ref>
<ref name="karnin1990simple">Karnin, E. D. (1990). A simple procedure for pruning back-propagation trained neural networks. (IEEE TNNLS 1990). [https://ieeexplore.ieee.org/document/80236 IEEE Xplore]</ref>
<ref name="lecun1989optimal">LeCun, Y., Denker, J. S., Solla, S. A., Howard, R. E., & Jackel, L. D. (1989). Optimal brain damage. (NeurIPS 1989). [http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.32.7223&rep=rep1&type=pdf PDF]</ref>
<ref name="srinivas2015data">Srinivas, S., & Babu, R. V. (2015). Data-free parameter pruning for deep neural networks. [https://arxiv.org/abs/1507.06149 PDF]</ref>
}}