;Architecture
Stack encoders.
==Interpretability==
;Interpretability Methods
* Built-in model interpretability
* Feature-level interpretability
* Instance-based explanations
We will focus on feature-level interpretability.
===Feature Level Interpretability===
These are typically presented as saliency maps.
* Perturbation-based: Perturb the input, re-run the model, and measure how much the output changes (a minimal sketch follows this list).
* Gradient-based
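A minimal sketch of the perturbation (occlusion) idea, assuming a PyTorch image classifier <code>model</code> and an input <code>x</code> of shape (1, C, H, W); the patch size and fill value are illustrative choices, not taken from the notes.
<syntaxhighlight lang="python">
import torch

def occlusion_saliency(model, x, target_class, patch=8, fill=0.0):
    """Slide an occluding patch over the input; the drop in class score is the saliency."""
    model.eval()
    with torch.no_grad():
        base = model(x)[0, target_class].item()             # score on the unperturbed input
        _, _, H, W = x.shape
        sal = torch.zeros(H, W)
        for i in range(0, H, patch):
            for j in range(0, W, patch):
                x_pert = x.clone()
                x_pert[:, :, i:i+patch, j:j+patch] = fill    # occlude one patch
                drop = base - model(x_pert)[0, target_class].item()
                sal[i:i+patch, j:j+patch] = drop             # larger drop = more salient region
    return sal
</syntaxhighlight>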
===Gradient-based Methods===
Take the derivative of the output with respect to the input; the magnitude of the gradient in each input dimension is taken as that dimension's saliency.
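A minimal sketch of a vanilla gradient saliency map under the same assumptions (a PyTorch classifier <code>model</code>, input <code>x</code> of shape (1, C, H, W)); the later sketches reuse this hypothetical <code>gradient_saliency</code> helper.
<syntaxhighlight lang="python">
import torch

def gradient_saliency(model, x, target_class):
    """|d(class score)/d(input)|, collapsed over channels into a (1, H, W) map."""
    model.eval()
    x = x.detach().clone().requires_grad_(True)  # track gradients with respect to the input
    score = model(x)[0, target_class]            # scalar score for the class of interest
    score.backward()                             # populates x.grad with d(score)/d(x)
    return x.grad.abs().max(dim=1)[0]            # max |gradient| over channels
</syntaxhighlight>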
;Limitations
* Too local and sensitive to slight perturbations
* Saturated outputs lead to unintuitive gradients
* Discontinuous gradients are problematic
;SmoothGrad
* Add Gaussian noise to the input several times and average the resulting gradients.
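A minimal SmoothGrad sketch, reusing the hypothetical <code>gradient_saliency</code> helper above; the number of noise samples and the noise scale are illustrative defaults.
<syntaxhighlight lang="python">
import torch

def smoothgrad_saliency(model, x, target_class, n_samples=25, sigma=0.1):
    """Average gradient saliency over several noisy copies of the input."""
    maps = []
    for _ in range(n_samples):
        noisy = x + sigma * torch.randn_like(x)   # add Gaussian noise to the input
        maps.append(gradient_saliency(model, noisy, target_class))
    return torch.stack(maps).mean(dim=0)          # average the per-sample saliency maps
</syntaxhighlight>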
;Integrated Gradients
* Average the gradients along a straight-line path from a baseline to the input, then scale by the input-baseline difference.
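A minimal Integrated Gradients sketch under the same assumptions; the all-zero baseline and the number of path steps are illustrative choices.
<syntaxhighlight lang="python">
import torch

def integrated_gradients(model, x, target_class, steps=50):
    """Average gradients along the straight-line path from a baseline to the input."""
    model.eval()
    baseline = torch.zeros_like(x)
    total_grad = torch.zeros_like(x)
    for alpha in torch.linspace(0.0, 1.0, steps):
        point = (baseline + alpha * (x - baseline)).detach().requires_grad_(True)  # point on the path
        score = model(point)[0, target_class]
        grad, = torch.autograd.grad(score, point)
        total_grad += grad
    return (x - baseline) * total_grad / steps   # scale the average gradient by (input - baseline)
</syntaxhighlight>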
;DeepLIFT
* Instead of the gradient, use the slope of the output relative to a ''reference'' (baseline) state.
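In symbols (following the standard presentation of DeepLIFT, not anything stated above): instead of the derivative <math>\partial y / \partial x</math>, each input is scored by a finite-difference slope against its reference activation, and the resulting contributions <math>C_{\Delta x_i \Delta y}</math> satisfy a summation-to-delta property:
:<math>\frac{\Delta y}{\Delta x} = \frac{y - y^{\mathrm{ref}}}{x - x^{\mathrm{ref}}}, \qquad \sum_i C_{\Delta x_i \Delta y} = \Delta y .</math>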
;Limitations
* Models must be able to compute the gradient of the output with respect to the input
* Interpretation of neural networks is fragile
** Saliency maps can be uninterpretable for adversarial examples, on both clean and adversarially trained models.
* Needs white-box gradient access to the model.
===Evaluation of Interpretability Methods===
* Human evaluation
** Can humans evaluate saliency?
* Accuracy drop after removing ''salient'' features
* Sanity checks
** Model parameter randomization test - compare the output of the saliency method on a trained vs. a randomly initialized (untrained) model to make sure the saliency depends on the model parameters (see the sketch after this list).
* Synthetic Data
* Data randomization test
** Train on randomly permuted labels and check whether the saliency still depends on the relationship between the inputs and the labels.
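A minimal sketch of the model parameter randomization test above, reusing the hypothetical <code>gradient_saliency</code> helper; Spearman rank correlation is one illustrative similarity measure.
<syntaxhighlight lang="python">
import copy
import torch
from scipy.stats import spearmanr

def parameter_randomization_test(model, x, target_class):
    """Compare saliency from the trained model against a randomly re-initialized copy."""
    sal_trained = gradient_saliency(model, x, target_class).flatten()

    random_model = copy.deepcopy(model)
    for p in random_model.parameters():         # destroy the learned weights
        torch.nn.init.normal_(p, mean=0.0, std=0.02)
    sal_random = gradient_saliency(random_model, x, target_class).flatten()

    rho, _ = spearmanr(sal_trained.numpy(), sal_random.numpy())
    return rho   # near 1 means the saliency map barely depends on the learned parameters
</syntaxhighlight>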
;Temporal Saliency Rescaling
* If you remove a feature at a given time step, how much does the gradient (saliency) change? (see the sketch below)
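A minimal sketch of the time-step masking idea for time-series inputs, assuming a PyTorch model on inputs of shape (1, T, F) and any saliency function with the signature used above; zero-masking is an illustrative choice.
<syntaxhighlight lang="python">
import torch

def time_relevance(model, x, target_class, saliency_fn):
    """Mask one time step at a time and measure how much the saliency map changes."""
    base_map = saliency_fn(model, x, target_class)
    scores = []
    for t in range(x.shape[1]):
        x_masked = x.clone()
        x_masked[:, t, :] = 0.0                 # remove one time step
        masked_map = saliency_fn(model, x_masked, target_class)
        scores.append((base_map - masked_map).abs().sum().item())
    return torch.tensor(scores)                 # larger change = more relevant time step
</syntaxhighlight>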


==Misc==