An adversarial example tries to trick a neural network by applying a small worst-case perturbation to a real example.<br>
Adversarial examples were first described in work co-authored by Ian Goodfellow.<br>
The two papers that introduced adversarial examples are:
* [https://arxiv.org/abs/1312.6199 Intriguing properties of neural networks] by Szegedy et al. in 2013
* [https://arxiv.org/abs/1412.6572 Explaining and Harnessing Adversarial Examples] by Goodfellow et al. in 2014
==Attacks==
===L-BFGS===
Limited-memory Broyden–Fletcher–Goldfarb–Shanno (L-BFGS)<br>
This is the attack used by Szegedy et al. in their paper: they search for the smallest perturbation that pushes an input into a chosen target class, and solve the resulting box-constrained optimization with L-BFGS.
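A sketch of the formulation (paraphrased from the paper; <math>f</math> is the classifier, <math>x</math> the input, <math>l</math> a target label different from the true one, and the constant <math>c > 0</math> is found by line search):

:<math>\min_{r} \; c\,\|r\|_2 + \operatorname{loss}_f(x + r,\ l) \quad \text{subject to} \quad x + r \in [0, 1]^m</math>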
===Fast Gradient Sign Method===
The fast gradient sign method (FGSM) perturbs the example by a small step of size <math>\epsilon</math> in the direction of the sign of the loss gradient.<br>
This was proposed by Goodfellow et al. in ''Explaining and Harnessing Adversarial Examples''.<br>
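A minimal sketch of FGSM in PyTorch (the model, the value of <code>epsilon</code>, and the [0, 1] input range are illustrative assumptions, not from the paper):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def fgsm_attack(model, x, y, epsilon=0.03):
    # Compute the loss gradient with respect to the input.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Step by epsilon in the direction of the gradient's sign:
    # x_adv = x + epsilon * sign(grad_x loss)
    x_adv = x + epsilon * x.grad.sign()
    # Clip back to the valid input range (assumed [0, 1]).
    return x_adv.clamp(0, 1).detach()
</syntaxhighlight>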
===Projected Gradient Descent===
Basic idea: take repeated gradient steps on the loss; whenever an iterate leaves the allowed perturbation range around the original example, project it back in.<br>
This was proposed by Madry et al. in their 2017 paper [https://arxiv.org/abs/1706.06083 Towards Deep Learning Models Resistant to Adversarial Attacks].<br>
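A sketch of PGD under an L-infinity constraint (the step size <code>alpha</code>, the number of <code>steps</code>, and the random start are hyperparameters; the values here are illustrative):

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=0.03, alpha=0.007, steps=40):
    x_orig = x.clone().detach()
    # Madry et al. start from a random point inside the epsilon-ball.
    x_adv = (x_orig + torch.empty_like(x_orig).uniform_(-epsilon, epsilon)).clamp(0, 1)
    for _ in range(steps):
        x_adv = x_adv.clone().detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        with torch.no_grad():
            # Gradient-sign step, then project back into the
            # epsilon-ball around the original example.
            x_adv = x_adv + alpha * x_adv.grad.sign()
            x_adv = x_orig + (x_adv - x_orig).clamp(-epsilon, epsilon)
            x_adv = x_adv.clamp(0, 1)  # stay in the valid input range
    return x_adv.detach()
</syntaxhighlight>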
==Defenses==
Most defenses generate adversarial examples at training time and train on them, a technique known as adversarial training.<br>
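For example, one adversarial training step might look like the sketch below, using a one-step FGSM inner attack; the 50/50 clean/adversarial mix and the value of <code>epsilon</code> are assumptions, and practical recipes vary:

<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon=0.03):
    # Craft adversarial examples against the current weights (one FGSM step).
    x_pert = x.clone().detach().requires_grad_(True)
    F.cross_entropy(model(x_pert), y).backward()
    x_adv = (x_pert + epsilon * x_pert.grad.sign()).clamp(0, 1).detach()

    # Train on a mix of clean and adversarial examples
    # (the 50/50 weighting is an assumed hyperparameter).
    optimizer.zero_grad()
    loss = 0.5 * (F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y))
    loss.backward()
    optimizer.step()
    return loss.item()
</syntaxhighlight>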
Below are some alternatives to this approach.
===Interval Bound Propagation===
Interval Bound Propagation (IBP) pushes elementwise lower and upper bounds on the activations through the network, layer by layer, to prove that no perturbation within the allowed range can change the prediction.<br>
[https://arxiv.org/abs/1810.12715 On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models] by Gowal et al. in 2018
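A sketch of the core interval arithmetic for a fully connected network (<code>weight</code> and <code>bias</code> are the parameters of one layer; the function names are illustrative):

<syntaxhighlight lang="python">
import torch

def interval_linear(lower, upper, weight, bias):
    # Write the interval as center +/- radius; a linear layer maps the
    # center through W and the radius through |W| (the worst case).
    center = (lower + upper) / 2
    radius = (upper - lower) / 2
    new_center = center @ weight.t() + bias
    new_radius = radius @ weight.abs().t()
    return new_center - new_radius, new_center + new_radius

def interval_relu(lower, upper):
    # ReLU is monotone, so it maps interval endpoints to endpoints.
    return lower.clamp(min=0), upper.clamp(min=0)
</syntaxhighlight>

Starting from the input interval <code>[x - epsilon, x + epsilon]</code> and chaining these functions through every layer gives bounds on the logits; if the true class's lower bound exceeds every other class's upper bound, no perturbation in the allowed range can flip the prediction.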
==NLP==
* [https://arxiv.org/abs/1901.06796 Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey]
===Benchmark Datasets===
====Text classification====
Sentiment analysis, gender identification, grammar error detection