==Pruning==
===Sensitivity Methods===
The idea here is to measure how sensitive each weight/connection or neuron is, i.e. how much the output changes if that weight or neuron is removed.
Typically, weights are pruned by zeroing them out, freezing them, and fine-tuning the remaining unfrozen weights.
In general, the procedure is
* Mozer and Smolensky (1988)<ref name="mozer1988skeletonization"></ref> use a gate for each neuron. The sensitivity can then be estimated with the derivative w.r.t. the gate.
* Karnin<ref name="karnin1990simple"></ref> estimates the sensitivity by monitoring the change in each weight during training.
* LeCun ''et al.'' present ''Optimal Brain Damage''<ref name="lecun1989optimal"></ref>, which uses the second derivative (the diagonal of the Hessian) of the loss to estimate the saliency of each weight.
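The sensitivity-based approaches above all reduce to the same recipe: score every weight with some saliency estimate, then zero out the lowest-scoring fraction. A minimal NumPy sketch of that recipe, using an OBD-style saliency <math>s_i = \tfrac{1}{2} h_{ii} w_i^2</math> with a placeholder curvature estimate (the Hessian diagonal, function names, and example values here are illustrative, not from any of the cited papers):

```python
import numpy as np

def prune_by_saliency(weights, saliency, fraction):
    """Zero out the `fraction` of weights with the lowest saliency.

    Returns the pruned weight array and a boolean mask that is True
    for the weights that were kept (the pruned ones would be frozen
    at zero during subsequent fine-tuning).
    """
    flat = saliency.ravel()
    k = int(fraction * flat.size)
    # Indices of the k least-salient weights.
    prune_idx = np.argsort(flat)[:k]
    keep = np.ones(flat.size, dtype=bool)
    keep[prune_idx] = False
    pruned = weights.ravel().copy()
    pruned[~keep] = 0.0
    return pruned.reshape(weights.shape), keep.reshape(weights.shape)

# Hypothetical example: OBD-style saliency s_i = h_ii * w_i^2 / 2,
# where h_ii is the diagonal of the Hessian of the loss. A real
# implementation would estimate h_ii from data; here it is a placeholder.
w = np.array([[0.5, -0.1], [2.0, 0.05]])
h_diag = np.ones_like(w)              # placeholder curvature estimate
saliency = 0.5 * h_diag * w ** 2
w_pruned, keep_mask = prune_by_saliency(w, saliency, fraction=0.5)
# With uniform curvature the two smallest-magnitude weights are zeroed.
```

With a constant Hessian diagonal this degenerates to magnitude pruning; the curvature term is what lets OBD keep a small weight that the loss is nonetheless very sensitive to.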
===Redundancy Methods===