Neural Network Compression

==Pruning==
===Sensitivity Methods===
The idea here is to measure how sensitive each weight (i.e. connection) or neuron is.
I.e., if you remove the neuron, how much will the output change?
Typically, weights are pruned by zeroing them out, freezing them, and fine-tuning the unfrozen weights.


In general, the procedure is:
# Train the network.
# Estimate the sensitivity of each weight or neuron.
# Prune the least sensitive ones by zeroing them out and freezing them.
# Fine-tune the remaining weights.
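A minimal sketch of this zero-and-freeze pruning loop, assuming a toy linear least-squares model in NumPy (the model, data, and pruning budget are all illustrative, not from any of the cited papers). Sensitivity is measured directly: zero out one weight at a time and record how much the loss increases.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: linear model y = X @ w, trained by gradient descent.
X = rng.normal(size=(100, 8))
w_true = np.array([3.0, 0.0, -2.0, 0.1, 0.0, 1.5, 0.05, 0.0])
y = X @ w_true

def loss(w):
    r = X @ w - y
    return float(r @ r) / len(y)

w = rng.normal(size=8)
for _ in range(200):                       # 1. train
    w -= 0.1 * (X.T @ (X @ w - y)) / len(y)

# 2. sensitivity of each weight: loss increase when that weight alone is zeroed
base = loss(w)
sens = np.array([loss(np.where(np.arange(8) == i, 0.0, w)) - base
                 for i in range(8)])

# 3. prune the 4 least sensitive weights: zero them out and freeze them
keep = sens > np.sort(sens)[3]             # True for the survivors
w *= keep

for _ in range(200):                       # 4. fine-tune the unfrozen weights
    w -= 0.1 * keep * (X.T @ (X @ w - y)) / len(y)

print(np.round(w, 2))                      # pruned entries stay exactly 0
```

Measuring sensitivity by actually ablating each weight is exact but costs one evaluation per weight; the methods below estimate it more cheaply from gradients or second derivatives.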
* Mozer and Smolensky (1988)<ref name="mozer1988skeletonization"></ref> use a gate for each neuron. The sensitivity can then be estimated with the derivative w.r.t. the gate.
* Karnin<ref name="karnin1990simple"></ref> estimates the sensitivity by monitoring the change in weight during training.
* LeCun ''et al.'' present ''Optimal Brain Damage''<ref name="lecun1989optimal"></ref>, which uses the second derivative of the loss with respect to each weight (a diagonal Hessian approximation) to estimate the saliency of removing it.
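Optimal Brain Damage scores each weight by the saliency <math>s_i = \tfrac{1}{2} h_{ii} w_i^2</math>, where <math>h_{ii}</math> is the corresponding diagonal entry of the Hessian of the loss. A numeric sketch, assuming a quadratic loss <math>L(w) = \lVert Xw - y \rVert^2 / n</math> whose Hessian <math>2X^\top X / n</math> is known in closed form (the weights and data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical pretrained weights and data for L(w) = ||X @ w - y||^2 / n.
X = rng.normal(size=(100, 5))
w = np.array([2.0, -0.01, 1.0, 0.02, -3.0])
n = len(X)

h_diag = 2.0 * np.sum(X**2, axis=0) / n    # diagonal of the Hessian 2 X^T X / n
saliency = 0.5 * h_diag * w**2             # OBD's estimated loss increase per weight

order = np.argsort(saliency)               # prune lowest-saliency weights first
print(order)
```

Because the saliency scales with <math>w_i^2</math>, the near-zero weights (indices 1 and 3 above) rank first for removal, while the largest-magnitude weight ranks last.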


===Redundancy Methods===