Adversarial Examples: Difference between revisions

Latest revision as of 14:42, 2 December 2019

An adversarial example is an input built by applying a small, worst-case perturbation to a real example so that a neural network misclassifies it.
Adversarial examples for neural networks were first described by Szegedy et al., with Ian Goodfellow among the co-authors.
The first two papers introducing adversarial examples are:

Explaining and Harnessing Adversarial Examples by Ian Goodfellow et al. in 2014
Intriguing properties of neural networks by Szegedy et al. in 2014

Attacks

L-BFGS

Szegedy et al. use box-constrained limited-memory Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) optimization in their paper to search for a small perturbation that changes the network's prediction.
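
The optimization problem handed to L-BFGS can be written roughly as follows. This is a sketch of the penalized objective as it is usually stated for that paper, not a verbatim copy. The symbols are assumptions for illustration: x is the real example, r the perturbation, l the target label, c > 0 a penalty weight found by line search, and m the input dimension.

```latex
\begin{aligned}
\min_{r}\quad & c\,\lVert r \rVert_2 + \mathrm{loss}_f(x + r,\ l) \\
\text{subject to}\quad & x + r \in [0, 1]^m
\end{aligned}
```

The box constraint keeps the perturbed input a valid image, which is why a box-constrained optimizer is used.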

Fast Gradient Sign Method

The fast gradient sign method (FGSM) takes the sign of the gradient of the loss with respect to the input and scales it by a small step size ε to form the perturbation.
Goodfellow et al. proposed it in Explaining and Harnessing Adversarial Examples.
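
A minimal sketch of FGSM, assuming a logistic-regression classifier so that the input gradient can be written by hand. The model, the function names, and the numbers in the usage lines are all hypothetical and chosen only for illustration; a real attack would take the gradient from an autodiff framework.

```python
import math


def loss_and_input_grad(w, b, x, y):
    """Cross-entropy loss of a logistic-regression model and its gradient w.r.t. x."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))           # predicted probability of class 1
    loss = -(y * math.log(p) + (1 - y) * math.log(1 - p))
    grad = [(p - y) * wi for wi in w]        # d(loss)/dx for this model
    return loss, grad


def sign(v):
    """Return -1, 0, or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One FGSM step: move each input coordinate by eps in the direction that raises the loss."""
    _, grad = loss_and_input_grad(w, b, x, y)
    return [xi + eps * sign(g) for xi, g in zip(x, grad)]


# Usage with made-up weights and a made-up example of class 1.
w, b = [2.0, -1.0, 0.5], 0.1
x, y = [0.5, 0.2, -0.3], 1
x_adv = fgsm(w, b, x, y, eps=0.1)
print(loss_and_input_grad(w, b, x, y)[0], loss_and_input_grad(w, b, x_adv, y)[0])
```

Every coordinate moves by exactly ε (or stays put where the gradient is zero), so the perturbation has L-infinity norm at most ε.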

Projected Gradient Descent

Basic idea: take repeated gradient steps that increase the loss. If a step moves too far from the original example, project the result back into the allowed perturbation range.
This was proposed by Madry et al. in their 2017 paper Towards Deep Learning Models Resistant to Adversarial Attacks.
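
The step-then-project loop can be sketched as follows, assuming an L-infinity perturbation range and a hand-differentiated logistic-regression model. Both are assumptions made for illustration, as are all names and numbers. The paper's attack also starts from a random point inside the range, which this sketch leaves out.

```python
import math


def input_grad(w, b, x, y):
    """Gradient of the cross-entropy loss w.r.t. the input for logistic regression."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    p = 1.0 / (1.0 + math.exp(-z))
    return [(p - y) * wi for wi in w]


def pgd_linf(w, b, x0, y, eps, step, n_steps):
    """Signed gradient ascent on the loss, projected onto the eps-ball around x0."""
    x = list(x0)
    for _ in range(n_steps):
        g = input_grad(w, b, x, y)
        # Ascent step: move each coordinate in the direction that raises the loss.
        x = [xi + step * ((gi > 0) - (gi < 0)) for xi, gi in zip(x, g)]
        # Projection: clip each coordinate back into [x0 - eps, x0 + eps].
        x = [min(max(xi, oi - eps), oi + eps) for xi, oi in zip(x, x0)]
    return x


# Usage with made-up weights and a made-up example of class 1.
w, b = [2.0, -1.0, 0.5], 0.1
x0, y = [0.5, 0.2, -0.3], 1
x_adv = pgd_linf(w, b, x0, y, eps=0.1, step=0.04, n_steps=5)
print(x_adv)
```

For an L-infinity range the projection is a per-coordinate clip. Other norms need a different projection step.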

Defenses

Most defenses are forms of adversarial training: they generate adversarial examples at training time and train on those adversarial examples.
Below are some alternatives to this approach.
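
For reference, the train-on-adversarial-examples loop described above can be sketched as follows. This is a minimal illustration, assuming a logistic-regression model, FGSM as the attack, and plain SGD; the toy dataset, function names, and hyperparameters are all hypothetical.

```python
import math


def predict(w, b, x):
    """Probability of class 1 under a logistic-regression model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def fgsm(w, b, x, y, eps):
    """Perturb x by eps per coordinate in the direction that raises the loss."""
    p = predict(w, b, x)
    grad = [(p - y) * wi for wi in w]
    return [xi + eps * ((g > 0) - (g < 0)) for xi, g in zip(x, grad)]


def adversarial_train(data, eps, lr=0.1, epochs=50):
    """SGD in which every update uses an adversarial version of the example."""
    w, b = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        for x, y in data:
            x_adv = fgsm(w, b, x, y, eps)     # attack the current model
            p = predict(w, b, x_adv)
            # Gradient step on the loss at the adversarial point.
            w = [wi - lr * (p - y) * xi for wi, xi in zip(w, x_adv)]
            b -= lr * (p - y)
    return w, b


# Toy, made-up dataset: two points per class.
data = [([2.0, 2.0], 1), ([1.5, 2.5], 1), ([-2.0, -2.0], 0), ([-2.5, -1.5], 0)]
w, b = adversarial_train(data, eps=0.5)
print(w, b)
```

Stronger attacks such as PGD can be swapped in for the inner FGSM call at a higher training cost.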

Interval Bound Propagation

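Interval Bound Propagation (IBP) pushes lower and upper bounds on the input through the network layer by layer, which bounds every output the network can produce for any perturbation in the allowed range.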
A paper
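
A minimal sketch of interval propagation through one affine layer followed by a ReLU, assuming an L-infinity perturbation range around the input. The layer weights, function names, and numbers are hypothetical, and the sketch covers only the bound computation, not the training procedure built on it.

```python
def affine_bounds(W, b, lower, upper):
    """Propagate elementwise input bounds through y = W x + b."""
    out_lower, out_upper = [], []
    for row, bias in zip(W, b):
        lo = hi = bias
        for wij, l, u in zip(row, lower, upper):
            # A positive weight maps the lower input bound to the lower output
            # bound; a negative weight swaps the roles of the two bounds.
            if wij >= 0:
                lo += wij * l
                hi += wij * u
            else:
                lo += wij * u
                hi += wij * l
        out_lower.append(lo)
        out_upper.append(hi)
    return out_lower, out_upper


def relu_bounds(lower, upper):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, l) for l in lower], [max(0.0, u) for u in upper]


# Usage: bound the layer output for every input within eps of x.
W = [[1.0, -2.0], [0.5, 1.0]]
b = [0.0, -1.0]
x, eps = [1.0, 0.5], 0.1
lower = [xi - eps for xi in x]
upper = [xi + eps for xi in x]
lower, upper = relu_bounds(*affine_bounds(W, b, lower, upper))
print(lower, upper)
```

The resulting bounds are sound but can be loose, because each layer treats its inputs as independent intervals.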

NLP

Adversarial Attacks on Deep Learning Models in Natural Language Processing: A Survey

Benchmark Datasets

Text classification

Semantic analysis, gender identification, grammar error detection

@@ Line 11: / Line 11: @@
 ===Fast Gradient Sign Method===
-The fast gradient sign method (FGSM) using the sign of the gradient times a unit vector as the perturbation.<br>
+The fast gradient sign method (FGSM) uses the sign of the gradient times a unit vector as the perturbation.<br>
 This was proposed by Ian Goodfellow in his paper.<br>
 ===Projected Gradient Descent===
 Basic idea: Do gradient descent. If you go too far from your example, project it back into your perturbation range.<br>
-This was proposed by Madry et al.<br>
+This was proposed by Madry et al. in their 2017 paper [https://arxiv.org/abs/1706.06083 Towards Deep Learning Models Resistant to Adversarial Attacks].<br>
 ==Defenses==