==Pruning==
===Sensitivity Methods===
The idea here is to measure how sensitive each weight/connection or neuron is, i.e. how much the output changes if that weight or neuron is removed.
Typically, weights are pruned by zeroing them out, freezing them, and fine-tuning the remaining unfrozen weights.
In general, the procedure is
* Mozer and Smolensky (1988)<ref name="mozer1988skeletonization"></ref> use a gate for each neuron. The sensitivity can then be estimated with the derivative w.r.t. the gate.
* Karnin<ref name="karnin1990simple"></ref> estimates the sensitivity by monitoring the change in each weight during training.
* LeCun ''et al.'' present ''Optimal Brain Damage''<ref name="lecun1989optimal"></ref>, which uses the second derivative (the diagonal of the Hessian) of the loss to estimate the saliency of each weight.
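The sensitivity-based approaches above all reduce to the same recipe: score every weight with some saliency estimate, then zero out the lowest-scoring fraction. A minimal NumPy sketch of that recipe, using an OBD-style saliency <math>s_i = \tfrac{1}{2} h_{ii} w_i^2</math> with a placeholder curvature estimate (the Hessian diagonal, function names, and example values here are illustrative, not from any of the cited papers):

```python
import numpy as np

def prune_by_saliency(weights, saliency, fraction):
    """Zero out the `fraction` of weights with the lowest saliency.

    Returns the pruned weight array and a boolean mask that is True
    for the weights that were kept (the pruned ones would be frozen
    at zero during subsequent fine-tuning).
    """
    flat = saliency.ravel()
    k = int(fraction * flat.size)
    # Indices of the k least-salient weights.
    prune_idx = np.argsort(flat)[:k]
    keep = np.ones(flat.size, dtype=bool)
    keep[prune_idx] = False
    pruned = weights.ravel().copy()
    pruned[~keep] = 0.0
    return pruned.reshape(weights.shape), keep.reshape(weights.shape)

# Hypothetical example: OBD-style saliency s_i = h_ii * w_i^2 / 2,
# where h_ii is the diagonal of the Hessian of the loss. A real
# implementation would estimate h_ii from data; here it is a placeholder.
w = np.array([[0.5, -0.1], [2.0, 0.05]])
h_diag = np.ones_like(w)              # placeholder curvature estimate
saliency = 0.5 * h_diag * w ** 2
w_pruned, keep_mask = prune_by_saliency(w, saliency, fraction=0.5)
# With uniform curvature the two smallest-magnitude weights are zeroed.
```

With a constant Hessian diagonal this degenerates to magnitude pruning; the curvature term is what lets OBD keep a small weight that the loss is nonetheless very sensitive to.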
===Redundancy Methods===