Deep Learning

===Formulation===
Threat Models: <math>x' \in T(x, \epsilon)</math>
Example: Lp model <math>\{x' \mid \Vert x' - x \Vert_{p} \leq \epsilon \}</math>
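As a concrete illustration of the Lp threat model as a constraint set, here is a minimal sketch of projecting a perturbed input back onto <math>T(x, \epsilon)</math>; the function name <code>project_lp</code> and the PyTorch setup are illustrative assumptions, not part of the lecture.
<syntaxhighlight lang="python">
# Minimal sketch: projection onto T(x, eps) = {x' : ||x' - x||_p <= eps}.
import torch

def project_lp(x_adv, x, eps, p):
    """Project x_adv onto the Lp ball of radius eps around x, for p in {2, inf}."""
    delta = x_adv - x
    if p == float("inf"):
        delta = delta.clamp(-eps, eps)                        # clip each coordinate
    elif p == 2:
        norm = delta.flatten(1).norm(dim=1).clamp(min=1e-12)  # per-example L2 norm
        factor = (eps / norm).clamp(max=1.0).view(-1, *[1] * (delta.dim() - 1))
        delta = delta * factor                                # shrink onto the ball if outside
    else:
        raise NotImplementedError("only p = 2 and p = inf shown here")
    return x + delta

# Usage: keep a perturbed batch inside an L_inf ball of radius 8/255.
x = torch.rand(4, 3, 32, 32)
x_adv = project_lp(x + 0.1 * torch.randn_like(x), x, eps=8 / 255, p=float("inf"))
</syntaxhighlight>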


Non-Lp threat models:

* Inner max = attack problem (PGD)
* Outer min = gradient descent (adversarial training; see the objective below)
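Putting the two bullets together, this is the standard robust-optimization (adversarial training) objective; the notation below is assumed rather than copied from the slides:

<math>\min_{\theta} \; \mathbb{E}_{(x,y) \sim \mathcal{D}} \left[ \max_{x' \in T(x, \epsilon)} \mathcal{L}(f_{\theta}(x'), y) \right]</math>

The inner maximization is approximated with PGD, and the outer minimization is ordinary (stochastic) gradient descent on <math>\theta</math>.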
===Other heuristic/empirical defenses===
Beginning of Lecture 9, Sept. 29, 2020 
ICLR 2018 included 6-7 empirical defenses. 
Athalye et al. (ICML 2018) show that obfuscated/masked gradients provide a ''false'' sense of security. 
Such defenses can be ''easily'' broken using ''adaptive'' attacks. 
Obfuscated gradients come in the form of shattered gradients (e.g. a non-differentiable op) or stochastic gradients (e.g. a randomized network).
Signs that a defense relies on gradient masking (none of these should happen for a genuinely robust model):
* One-step attacks perform better than iterative attacks
* Black-box attacks perform better than white-box attacks
* Unbounded attacks do not reach a 100% success rate
* Increasing the attack budget does not increase the success rate
;How to attack defenses that use gradient masking?
* Defense based on adding a ''non-differentiable'' operator:
Example 1: 
<math>\hat{f}(x) = f(g(x))</math> with g non-differentiable and non-smooth. 
Attack (BPDA): run the forward pass through <math>f(g(x))</math>, but on the backward pass approximate g by the identity (reasonable when <math>g(x) \approx x</math>), i.e. just take the gradient of f evaluated at g(x).
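A minimal sketch of this idea (Backward Pass Differentiable Approximation), assuming a toy quantization defense as g, a hypothetical linear classifier f, and PyTorch; none of these specifics come from the lecture.
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

class StraightThroughG(torch.autograd.Function):
    """Forward: apply the non-differentiable preprocessing g.
    Backward: approximate g with the identity (since g(x) ~= x)."""
    @staticmethod
    def forward(ctx, x):
        return torch.round(x * 8) / 8  # toy non-differentiable g: 3-bit quantization
    @staticmethod
    def backward(ctx, grad_output):
        return grad_output             # pass gradients straight through

def pgd_bpda(f, x, y, eps=0.1, alpha=0.02, steps=20):
    """PGD attack on f(g(x)), differentiating through g via BPDA."""
    x_adv = x.clone().detach()
    for _ in range(steps):
        x_adv.requires_grad_(True)
        logits = f(StraightThroughG.apply(x_adv))    # forward uses the real g
        loss = nn.functional.cross_entropy(logits, y)
        grad = torch.autograd.grad(loss, x_adv)[0]   # backward uses identity for g
        x_adv = x_adv.detach() + alpha * grad.sign()
        x_adv = torch.min(torch.max(x_adv, x - eps), x + eps).clamp(0, 1)  # L_inf projection
    return x_adv

# Usage with a hypothetical classifier f on 28x28 inputs:
f = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
x = torch.rand(4, 1, 28, 28)
y = torch.randint(0, 10, (4,))
x_adv = pgd_bpda(f, x, y)
</syntaxhighlight>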
Example 2: 
Defense uses a ''randomized'' classifier, i.e. stochastic gradients. 
Attack (EOT): take the expectation over the randomization, e.g. average the gradient over many random draws of the defense's noise.
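A minimal sketch of this expectation-over-randomization gradient (EOT), assuming the defense adds Gaussian noise to the input before classification; the noise model, <code>eot_gradient</code>, and the toy classifier are illustrative assumptions.
<syntaxhighlight lang="python">
import torch
import torch.nn as nn

def eot_gradient(f, x, y, sigma=0.1, num_samples=16):
    """Estimate E_r[ grad_x loss(f(x + r), y) ] by Monte Carlo averaging."""
    x = x.clone().detach().requires_grad_(True)
    total = 0.0
    for _ in range(num_samples):
        noisy = x + sigma * torch.randn_like(x)       # one draw of the defense's randomness
        total = total + nn.functional.cross_entropy(f(noisy), y)
    return torch.autograd.grad(total / num_samples, x)[0]

# Usage with a hypothetical classifier on 28x28 inputs:
f = nn.Sequential(nn.Flatten(), nn.Linear(784, 10))
x, y = torch.rand(4, 1, 28, 28), torch.randint(0, 10, (4,))
g = eot_gradient(f, x, y)
x_adv = (x + 0.03 * g.sign()).clamp(0, 1)  # e.g. one FGSM step with the averaged gradient
</syntaxhighlight>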
;Are adversarial examples inevitable? 
On the sphere: <math>d_{\infty}(x, x') \leq d_{2}(x, x') \leq d_{g}(x, x')</math>, where <math>d_g</math> is the geodesic distance, so guarantees in the geodesic metric imply guarantees for the <math>\ell_2</math> and <math>\ell_\infty</math> metrics. 
Classification problem: <math>\{1,..., C\} = [C]</math> labels. 
Each class has a density function <math>\rho_{c}</math> which is bounded.
;Isoperimetric Inequality
Of all closed surfaces that enclose a unit volume, the sphere has the smallest surface area. 
Very intuitive but difficult to prove (Osserman et al 1976).
;Lemma (Lévy & Pellegrino 1951)
Consider a subset <math>A \subset S^{d-1}</math> with normalized measure <math>\mu_1(A) \geq \frac{1}{2}</math>. In the geodesic metric, the <math>\epsilon</math>-expansion <math>A(\epsilon)</math> is at least as large as the <math>\epsilon</math>-expansion of a half-sphere.
;Lemma (Milman & Schechtman 1986)
For <math>A</math> as above (<math>\mu_1(A) \geq \frac{1}{2}</math>): <math>\mu_1[A(\epsilon)] \geq 1 - \left(\frac{\pi}{8}\right)^{1/2} \exp\left(-\frac{d-1}{2} \epsilon^2\right)</math> 
For any fixed <math>\epsilon > 0</math>, the right-hand side goes to 1 as d goes to infinity.
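A quick numerical sanity check of this concentration bound; the script and the chosen values of d and <math>\epsilon</math> are just an illustration, assuming the bound exactly as stated above.
<syntaxhighlight lang="python">
import math

def sphere_expansion_lower_bound(d, eps):
    """Lower bound on the normalized measure of the eps-expansion A(eps)."""
    return 1.0 - math.sqrt(math.pi / 8.0) * math.exp(-(d - 1) / 2.0 * eps ** 2)

for d in [10, 100, 1_000, 10_000, 100_000]:
    print(d, round(sphere_expansion_lower_bound(d, eps=0.01), 6))
# The bound approaches 1 as d grows, even for a small fixed eps.
</syntaxhighlight>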
;Theorem 
C-class classification on the sphere. 
Class c has density <math>\rho_c</math> with upper bound <math>U_c = \sup_x \rho_c(x)</math>; let <math>V_c = s_{d-1} \cdot U_c</math> be this bound relative to the uniform density, where <math>s_{d-1}</math> is the surface area of <math>S^{d-1}</math>. 
Let <math>f_c = \mu_1 \{x \mid c(x)=c\}</math> be the fraction of the sphere the classifier assigns to class c. 
Pick a class with <math>f_c \leq \frac{1}{2}</math>. 
Sample x from <math>\rho_c</math>. 
Either x is misclassified or x has an <math>\epsilon</math>-adversarial example (in the geodesic metric). 
One of these happens with probability at least <math>1 - V_c \left(\frac{\pi}{8}\right)^{1/2} \exp\left(-\frac{d-1}{2} \epsilon^2\right)</math><br>
Proof:
Consider the region with the correct classification: <math>R=\{x \mid c(x)=c\}</math>. Here <math>\mu_1(R) = f_c \leq 1/2</math>. 
Consider the complement <math>R^c</math>, so <math>\mu_1(R^c) \geq \frac{1}{2}</math>. 
By the lemma, the <math>\epsilon</math>-expansion satisfies <math>\mu_1[R^c(\epsilon)] \geq 1 - (\frac{\pi}{8})^{1/2} \exp(-\frac{d-1}{2}\epsilon^2)</math>; any x in <math>R^c(\epsilon)</math> is either misclassified (if it lies in <math>R^c</math>) or within geodesic distance <math>\epsilon</math> of a misclassified point. Converting from the uniform measure to <math>\rho_c</math> costs at most the factor <math>V_c</math>.
;Lemma (cube)
Let <math>A \subset [0, 1]^d</math> with <math>vol(A) \geq \frac{1}{2}</math>, and let <math>p^* = \min(p, 2)</math>. The <math>\epsilon</math>-expansion in the <math>\ell_p</math> metric satisfies 
<math>vol[A(\epsilon; d_p)] \geq 1 - \frac{\exp(-2 \pi d^{1-2/p^*} \epsilon^2)}{2 \pi \epsilon d^{1/2 - 1/p^*}}</math> 
For <math>p \geq 2</math> this simplifies to <math>vol[A(\epsilon; d_p)] \geq 1 - \frac{\exp(-2 \pi \epsilon^2)}{2 \pi \epsilon}</math>. 
For p=2, the diameter of the hypercube is <math>O(\sqrt{d})</math>, so the <math>\epsilon</math> needed is tiny relative to the diameter. 
For p=infinity, the diameter of the cube is 1.
This shows that if you pick a random sample, with high probability it is either misclassified or has an adversarial example within <math>\epsilon</math>.
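A similar quick check for the cube bound with <math>p \geq 2</math>, comparing the <math>\epsilon</math> needed against the cube's <math>\ell_2</math> diameter; the input size used for the comparison is an arbitrary illustrative choice, assuming the simplified bound stated above.
<syntaxhighlight lang="python">
import math

def cube_expansion_lower_bound(eps):
    """Dimension-independent lower bound on vol[A(eps; d_p)] for p >= 2."""
    return 1.0 - math.exp(-2.0 * math.pi * eps ** 2) / (2.0 * math.pi * eps)

d = 3 * 32 * 32  # e.g. a CIFAR-10-sized input, used only for the diameter comparison
for eps in [0.3, 0.5, 0.75, 1.0]:
    print(f"eps={eps:>4}: bound={cube_expansion_lower_bound(eps):.3f}, "
          f"eps / l2-diameter ~ {eps / math.sqrt(d):.4f}")
# A modest constant eps already makes the bound close to 1, while eps remains a
# tiny fraction of the cube's l2 diameter sqrt(d).
</syntaxhighlight>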
;Are adversarial examples inevitable in practice?
This is an ill-posed question. 
It depends on the data distribution, threat model, and hypothesis class.


==Misc==