
Adversarial Examples

An adversarial example tries to trick a neural network by applying a small worst-case perturbation to a real example.<br>
The first two papers introducing adversarial examples, both co-authored by Ian Goodfellow, are:
* [https://arxiv.org/abs/1312.6199 Intriguing properties of neural networks] by Szegedy et al. in 2013
* [https://arxiv.org/abs/1412.6572 Explaining and Harnessing Adversarial Examples] by Goodfellow et al. in 2014


==Attacks==
===L-BFGS===
Limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS)<br>
Szegedy et al. use box-constrained L-BFGS in their paper to search for a small perturbation that changes the model's prediction to a chosen target class.
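A rough sketch of the idea in Python, using SciPy's box-constrained L-BFGS-B to minimize a weighted sum of the perturbation size and the loss toward a target class. The model, input x, target label, and trade-off constant c are placeholders; Szegedy et al. additionally line-search over c, so this is only the inner optimization step.
<syntaxhighlight lang="python">
import numpy as np
import torch
import torch.nn.functional as F
from scipy.optimize import minimize

def lbfgs_attack(model, x, target, c=0.1):
    """Minimize c*||r||^2 + loss(x + r, target) over the box 0 <= x + r <= 1."""
    def objective(flat):
        # SciPy hands us a flat float64 vector; rebuild a torch tensor from it.
        x_adv = torch.tensor(flat, dtype=torch.float32, requires_grad=True)
        r = x_adv.view_as(x) - x
        loss = c * (r ** 2).sum() + F.cross_entropy(
            model(x_adv.view_as(x).unsqueeze(0)), torch.tensor([target]))
        loss.backward()
        # Return the objective value and its gradient for L-BFGS-B.
        return loss.item(), x_adv.grad.numpy().astype(np.float64)

    x0 = x.detach().numpy().ravel().astype(np.float64)
    res = minimize(objective, x0, jac=True, method="L-BFGS-B",
                   bounds=[(0.0, 1.0)] * x0.size)  # keep pixels in the valid box
    return torch.tensor(res.x, dtype=torch.float32).view_as(x)
</syntaxhighlight>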
===Fast Gradient Sign Method===
The fast gradient sign method (FGSM) uses the sign of the gradient, scaled by a small constant ε, as the perturbation.<br>
This was proposed by Ian Goodfellow in his paper.<br>
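A minimal FGSM sketch in PyTorch; the model, labels, and ε are stand-ins, and inputs are assumed to lie in [0, 1].
<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def fgsm(model, x, y, epsilon):
    # Compute the loss gradient with respect to the input.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    grad, = torch.autograd.grad(loss, x)
    # Step by epsilon in the sign direction of the gradient, then
    # clip back to the valid input range (assumed [0, 1] here).
    return (x + epsilon * grad.sign()).clamp(0.0, 1.0).detach()
</syntaxhighlight>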
 
===Projected Gradient Descent===
Basic idea: Do gradient descent. If you go too far from your example, project it back into your perturbation range.<br>
This was proposed by Madry et al. in their 2017 paper [https://arxiv.org/abs/1706.06083 Towards Deep Learning Models Resistant to Adversarial Attacks].<br>
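A minimal PGD sketch under an L∞ threat model, again assuming inputs in [0, 1]; the step size, iteration count, and random start are illustrative choices, not values from the paper.
<syntaxhighlight lang="python">
import torch
import torch.nn.functional as F

def pgd(model, x, y, epsilon, step_size, num_steps):
    # Random start inside the L-infinity ball, as in Madry et al.
    x_adv = (x + torch.empty_like(x).uniform_(-epsilon, epsilon)).clamp(0.0, 1.0)
    for _ in range(num_steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        with torch.no_grad():
            x_adv = x_adv + step_size * grad.sign()  # ascent step on the loss
            # Project back into the epsilon-ball around x, then into [0, 1].
            x_adv = torch.min(torch.max(x_adv, x - epsilon), x + epsilon)
            x_adv = x_adv.clamp(0.0, 1.0)
    return x_adv.detach()
</syntaxhighlight>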


==Defenses==
Most defenses focus on adversarial training: generating adversarial examples at training time and training on those adversarial examples, as in the sketch below.<br>
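As an illustration, one simple form of adversarial training replaces each clean batch with adversarial examples crafted on the fly, here reusing the fgsm sketch from the attacks section above (the model, optimizer, and ε are placeholders).
<syntaxhighlight lang="python">
import torch.nn.functional as F

def adversarial_training_step(model, optimizer, x, y, epsilon):
    # Craft adversarial examples from the current batch, then train on them.
    x_adv = fgsm(model, x, y, epsilon)  # fgsm as sketched in the FGSM section
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()
</syntaxhighlight>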
Below are some alternatives to this approach.
===Interval Bound Propagation===
Interval Bound Propagation (IBP) propagates interval bounds on the activations through the network, giving a certified bound on the worst-case output over the entire perturbation range.<br>
[https://arxiv.org/abs/1810.12715 On the Effectiveness of Interval Bound Propagation for Training Verifiably Robust Models] by Gowal et al. in 2018
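The core operation is pushing an interval bound on the inputs through the network layer by layer: for an affine layer the bounds follow from the absolute values of the weights, and monotone activations pass bounds through elementwise. A minimal sketch of the bound propagation only, not the training scheme from the paper:
<syntaxhighlight lang="python">
import torch

def linear_interval_bounds(weight, bias, lower, upper):
    # Push elementwise bounds lower <= x <= upper through x -> weight @ x + bias.
    center = (lower + upper) / 2
    radius = (upper - lower) / 2
    new_center = weight @ center + bias
    new_radius = weight.abs() @ radius  # |W| bounds how far each output can move
    return new_center - new_radius, new_center + new_radius

def relu_interval_bounds(lower, upper):
    # ReLU is monotone, so bounds map through elementwise.
    return lower.clamp(min=0), upper.clamp(min=0)
</syntaxhighlight>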
==NLP==
* [https://arxiv.org/abs/1901.06796 Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey]
===Benchmark Datasets===
====Text classification====
Sentiment analysis, gender identification, grammar error detection