Deep Learning

Obfuscated gradients appear as shattered gradients (e.g. from a non-differentiable op) or stochastic gradients (e.g. from a randomized network).
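
A minimal illustration of the shattered case (a sketch; the quantization defense and the tiny model are hypothetical stand-ins): a piecewise-constant op such as rounding has zero gradient almost everywhere, so differentiating through it gives the attacker no signal.

<syntaxhighlight lang="python">
import torch

# Hypothetical defense: quantize inputs to a few levels before the classifier.
def quantize(x, levels=8):
    # torch.round is piecewise constant, so its gradient is 0 almost everywhere.
    return torch.round(x * (levels - 1)) / (levels - 1)

model = torch.nn.Linear(10, 2)              # stand-in classifier
x = torch.rand(1, 10, requires_grad=True)
loss = model(quantize(x)).sum()
loss.backward()
print(x.grad.abs().max())                   # tensor(0.): the gradient is shattered
</syntaxhighlight>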


;How to identify obfuscated or masked gradients?
Check the following:
* One-step attack > iterative attack.
* Black-box attack > white-box attack.
* Unbounded attack (i.e. <math>\rho</math> can go to infinity) does not yield a 100% success rate.
* Increasing the attack budget does not increase the success rate.
If any of the above hold, then your defense may be based on obfuscated or masked gradients; a sketch of the last two checks is given below.
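
A minimal sketch of the budget checks (illustrative Python/PyTorch; <code>model</code> and <code>attack</code> are placeholders for your classifier and attack routine):

<syntaxhighlight lang="python">
import torch

def success_rate(model, attack, xs, ys, eps):
    # Fraction of inputs whose predicted label flips under the attack with budget eps.
    x_adv = attack(model, xs, ys, eps)
    return (model(x_adv).argmax(dim=1) != ys).float().mean().item()

def budget_sanity_check(model, attack, xs, ys,
                        budgets=(0.01, 0.03, 0.1, 0.3, 1.0)):
    # Success should grow with the budget and approach 100% as eps grows;
    # a flat or non-monotone curve hints at obfuscated or masked gradients.
    for eps in budgets:
        print(f"eps={eps:.2f}  success={success_rate(model, attack, xs, ys, eps):.1%}")
</syntaxhighlight>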


;How to attack defenses using gradient masking?
For a randomized network, just take the expectation over the randomization.
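
A minimal sketch of this in PyTorch (names are illustrative; <code>random_transform</code> stands for the defense's randomization): estimate the gradient of the expected loss by Monte Carlo, then feed it to any standard gradient-based attack.

<syntaxhighlight lang="python">
import torch

def expected_gradient(model, random_transform, x, y, n_samples=30):
    # Monte Carlo estimate of grad_x E_t[ L(model(t(x)), y) ]:
    # averaging over the randomness smooths out the stochastic gradients.
    x = x.clone().detach().requires_grad_(True)
    loss_fn = torch.nn.CrossEntropyLoss()
    grad = torch.zeros_like(x)
    for _ in range(n_samples):
        loss = loss_fn(model(random_transform(x)), y)
        grad += torch.autograd.grad(loss, x)[0]
    return grad / n_samples
</syntaxhighlight>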


===Are adversarial examples inevitable?===
;Notation
<math>S^{d-1} = \{x \in \mathbb{R}^d \mid \Vert x \Vert_2 = 1\}</math> is the unit sphere.
Let the geodesic distance be denoted by <math>d_{g}</math>.
This is the length of the shortest path on the sphere; for unit vectors, <math>d_g(x, x') = \arccos(\langle x, x' \rangle)</math>.
 
On the sphere: <math>d_{\infty}(x, x') \leq d_{2}(x, x') \leq d_{g}(x, x')</math>.
Classification problem: labels <math>\{1, \dots, C\} = [C]</math>.
Each class <math>c</math> has a density function <math>\rho_{c}</math>, which is bounded.
Let <math>U_c = \sup_{x} \rho_c(x)</math> be the largest density of class <math>c</math>.
 
The <math>\epsilon</math>-expansion of <math>A</math> with respect to a metric <math>d</math>:
<math>A(\epsilon, d) = \{x \mid d(x,z)\leq \epsilon \text{ for some } z \in A\}</math>.
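
For example, take the half sphere <math>A = \{x \in S^{d-1} \mid x_1 \leq 0\}</math>. A point at height <math>x_1 = \sin\theta</math> lies at geodesic distance <math>\theta</math> from the equator, so for <math>0 \leq \epsilon \leq \pi/2</math> its expansion under the geodesic metric is the cap <math>A(\epsilon, d_g) = \{x \in S^{d-1} \mid x_1 \leq \sin \epsilon\}</math>.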


;Isoperimetric Inequality
Of all closed surfaces enclosing a fixed volume, the sphere has the smallest surface area.
Very intuitive but difficult to prove (Osserman et al. 1976).


;Lemma (Lévy & Pellegrino 1951, simplified by Talagrand 1995)
Consider a subset <math>A \subseteq S^{d-1} \subset \mathbb{R}^d</math> with normalized measure <math>\mu_1(A) \geq 1/2</math>.
Using the geodesic metric, the <math>\epsilon</math>-expansion <math>A(\epsilon)</math> is at least as large as the <math>\epsilon</math>-expansion of a half sphere.
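
Since on the sphere <math>d_{\infty}(x, x') \leq d_{2}(x, x') \leq d_{g}(x, x')</math>, every geodesic <math>\epsilon</math>-ball is contained in the corresponding <math>\ell_2</math> and <math>\ell_\infty</math> balls, so <math>A(\epsilon, d_g) \subseteq A(\epsilon, d_2) \subseteq A(\epsilon, d_\infty)</math>. A lower bound on the measure of the geodesic expansion therefore also lower-bounds the <math>\ell_2</math> and <math>\ell_\infty</math> expansions, which is how the lemma applies to <math>\ell_p</math>-bounded adversarial perturbations.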


;Lemma (Milman & Schechtman 1986)