SURF: Speeded Up Robust Features: Difference between revisions

Revision as of 23:15, 22 April 2020

Paper

Feature Extraction

Fast-Hessian Detector

Features are image regions where the determinant of the Hessian attains a local maximum.

The Hessian matrix:

\(\displaystyle \mathcal{H}(\mathbf{x}, \sigma) = \begin{bmatrix} L_{xx}(\mathbf{x}, \sigma) & L_{xy}(\mathbf{x}, \sigma)\\ L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma) \end{bmatrix}\)

Each entry is the convolution of a Gaussian second-order derivative with the image at point \(\displaystyle \mathbf{x}\).
These convolutions are approximated using box filters on an integral image.
The approximations are denoted as \(\displaystyle D_{xx}, D_{yy}, D_{xy}\)
The determinant of the Hessian is then approximated as \(\displaystyle D_{xx}D_{yy} - (0.9\,D_{xy})^2\)
- The weight 0.9 corrects for the box-filter approximation; it comes from the ratio of Frobenius norms:
  \(\displaystyle \frac{|L_{xy}(1.2)|_{F}}{|L_{xx}(1.2)|_{F}}\frac{|D_{xx}(9)|_{F}}{|D_{xy}(9)|_{F}} = 0.912\)
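Interest points are local maxima of this approximated determinant, which is used for both location and scale selection. The trace of the Hessian (the Laplacian) is not used for detection; its sign is used later to speed up matching.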
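
A minimal Python/NumPy sketch of the box-filter approximation described above. The function names are hypothetical, and the lobe layout of the 9×9 filters (three 5×3 lobes weighted +1, -2, +1 for \(D_{xx}\) and \(D_{yy}\), four 3×3 lobes for \(D_{xy}\)) is an assumption based on a common reading of the paper's figure. Responses are left unnormalized.

```python
import numpy as np

def integral_image(img):
    """Summed-area table, padded with a leading zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.float64), axis=1)
    return ii

def box_sum(ii, r0, c0, r1, c1):
    """Sum of img[r0:r1, c0:c1] from four lookups in the integral image."""
    return ii[r1, c1] - ii[r0, c1] - ii[r1, c0] + ii[r0, c0]

def hessian_det_9x9(ii, r, c, w=0.9):
    """Approximate det(H) at pixel (r, c); needs a 4-pixel margin on every side."""
    # D_xx: three side-by-side lobes (5 rows x 3 columns each) weighted +1, -2, +1,
    # computed as (all three lobes) - 3 * (middle lobe).
    dxx = (box_sum(ii, r - 2, c - 4, r + 3, c + 5)
           - 3.0 * box_sum(ii, r - 2, c - 1, r + 3, c + 2))
    # D_yy: the same filter transposed.
    dyy = (box_sum(ii, r - 4, c - 2, r + 5, c + 3)
           - 3.0 * box_sum(ii, r - 1, c - 2, r + 2, c + 3))
    # D_xy: four 3x3 lobes around the centre. The overall sign convention
    # does not affect the determinant because D_xy is squared.
    dxy = (box_sum(ii, r - 3, c - 3, r, c)
           + box_sum(ii, r + 1, c + 1, r + 4, c + 4)
           - box_sum(ii, r - 3, c + 1, r, c + 4)
           - box_sum(ii, r + 1, c - 3, r + 4, c))
    return dxx * dyy - (w * dxy) ** 2
```

Each box sum costs four lookups regardless of the box size, which is why the filters can be enlarged without extra cost per pixel.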

Scale-space representation

The authors increase (e.g. double) the size of the box filters, rather than shrinking the image, to get representations at multiple scales.
The authors apply a "non-maximum suppression in a \(\displaystyle 3 \times 3 \times 3\) neighborhood" to "localise interest points in the image and over scales".
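- Non-maximum suppression is a filtering technique that removes duplicate detections. In the \(\displaystyle 3 \times 3 \times 3\) case, a sample is kept only if its response exceeds that of its 26 neighbors in space and scale.
  Basic idea of the general, region-based form: let B be the set of candidate regions and D the filtered set to output.
  
  Move the highest-confidence region from B to D.
  
  Delete every region remaining in B whose IoU with the selected region exceeds a threshold, then repeat until B is empty.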
  
  See non-maximum suppression
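
Both forms of non-maximum suppression mentioned above can be sketched in Python. This is an illustration under assumptions, not the paper's implementation: the function names, the `(scale, row, col)` array layout, the `(x0, y0, x1, y1)` box format and the default thresholds are all hypothetical.

```python
import numpy as np

def local_maxima_3x3x3(responses, threshold=0.0):
    """Return (scale, row, col) indices whose response is a strict maximum
    over the 26 neighbours in a (scales, rows, cols) response stack."""
    peaks = []
    n_scales, n_rows, n_cols = responses.shape
    for s in range(1, n_scales - 1):
        for r in range(1, n_rows - 1):
            for c in range(1, n_cols - 1):
                v = responses[s, r, c]
                if v <= threshold:
                    continue
                block = responses[s - 1:s + 2, r - 1:r + 2, c - 1:c + 2]
                # Strict maximum: v is the largest value and appears only once.
                if v >= block.max() and np.count_nonzero(block == v) == 1:
                    peaks.append((s, r, c))
    return peaks

def iou(a, b):
    """Intersection over union of two boxes given as (x0, y0, x1, y1)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0

def greedy_nms(boxes, scores, iou_threshold=0.5):
    """Region-based NMS: repeatedly keep the best box, drop its near-duplicates."""
    remaining = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept = []
    while remaining:
        best = remaining.pop(0)          # move the top-scoring region from B to D
        kept.append(best)
        remaining = [i for i in remaining
                     if iou(boxes[best], boxes[i]) <= iou_threshold]
    return kept
```

The first function is the variant that matches the \(3 \times 3 \times 3\) neighborhood: it compares scalar responses and needs no overlap measure. The second follows the region-based steps listed above.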

SURF Descriptor

Orientation Assignment

Sample Haar-wavelet responses in the x and y directions at points around each feature.
- Using integral images, only 6 operations are needed to compute the response in the x or y direction.
  The two adjacent boxes of the filter have 6 distinct corners, so 5 fma operations are needed in total for each direction.
Slide an orientation window covering an angle of \(\displaystyle \frac{\pi}{3}\) through the full 360 degrees around the feature; at each position, sum the horizontal and vertical responses inside the window to obtain a vector. Note that the window moves in increments of \(\displaystyle \frac{\pi}{3}\).
Pick the direction of the longest vector.
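
The orientation steps above can be sketched as follows. All names are hypothetical. The responses are summed unweighted, and the window step defaults to the \(\frac{\pi}{3}\) increment recorded in these notes; a finer step can be passed instead.

```python
import numpy as np

def integral_image(img):
    """Summed-area table, padded with a leading zero row and column."""
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = np.cumsum(np.cumsum(img, axis=0, dtype=np.float64), axis=1)
    return ii

def haar_x(ii, r, c, s):
    """Right half minus left half of the window img[r-s/2:r+s/2, c-s/2:c+s/2].
    Uses the 6 distinct corners of the two adjacent boxes; s must be even."""
    h = s // 2
    top, bot, left, mid, right = r - h, r + h, c - h, c, c + h
    return ((ii[bot, right] - ii[top, right])
            - 2.0 * (ii[bot, mid] - ii[top, mid])
            + (ii[bot, left] - ii[top, left]))

def haar_y(ii, r, c, s):
    """Bottom half minus top half of the same window."""
    h = s // 2
    top, mid, bot, left, right = r - h, r, r + h, c - h, c + h
    return ((ii[bot, right] - ii[bot, left])
            - 2.0 * (ii[mid, right] - ii[mid, left])
            + (ii[top, right] - ii[top, left]))

def dominant_orientation(dx, dy, window=np.pi / 3, step=np.pi / 3):
    """Angle (radians) of the longest summed response vector over all windows."""
    dx = np.asarray(dx, dtype=float)
    dy = np.asarray(dy, dtype=float)
    angles = np.arctan2(dy, dx)
    n_steps = max(1, int(round(2 * np.pi / step)))
    best_len, best_theta = -1.0, 0.0
    for start in np.arange(n_steps) * step:
        # Responses whose angle lies in [start, start + window), wrapping at 2*pi.
        inside = np.mod(angles - start, 2 * np.pi) < window
        sx, sy = dx[inside].sum(), dy[inside].sum()
        length = np.hypot(sx, sy)
        if length > best_len:
            best_len, best_theta = length, np.arctan2(sy, sx)
    return best_theta
```

In use, `dx` and `dy` would hold the `haar_x` and `haar_y` responses sampled at the points around one feature.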

Descriptor Components

Create square regions positioned at feature points and oriented using the calculated orientation
...
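
As a rough illustration of the first step only, the sketch below builds the sample coordinates of a square centered on a feature point and rotated by its orientation. The function name, the side length and the number of samples per side `n` are hypothetical; the notes do not specify them.

```python
import numpy as np

def oriented_square_grid(center_xy, theta, side, n=4):
    """Return an (n*n, 2) array of (x, y) sample positions covering a square of
    the given side length, rotated by theta (radians) about center_xy."""
    half = side / 2.0
    ticks = np.linspace(-half, half, n)
    u, v = np.meshgrid(ticks, ticks)      # offsets in the feature's own frame
    cos_t, sin_t = np.cos(theta), np.sin(theta)
    # Rotate the offsets by theta, then translate to the feature position.
    x = center_xy[0] + cos_t * u - sin_t * v
    y = center_xy[1] + sin_t * u + cos_t * v
    return np.stack([x.ravel(), y.ravel()], axis=1)
```

For example, `oriented_square_grid((10.0, 10.0), 0.0, 4.0, n=3)` yields an axis-aligned 3×3 grid around the point (10, 10).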

Resources

Medium Introduction

@@ Line 4: / Line 4: @@
 ==Feature Extraction==
-Fast-Hessian Detector
+===Fast-Hessian Detector===
-The Hessian matrix:
+Features are image regions where the determinant of the Hessian attains a local maximum.
+* The Hessian matrix:
 <math>\mathcal{H}(\mathbf{x}, \sigma)
 = \begin{bmatrix}
@@ Line 12: / Line 13: @@
 L_{xy}(\mathbf{x}, \sigma) & L_{yy}(\mathbf{x}, \sigma)
 \end{bmatrix}</math>
+* Each entry is the convolution of a Gaussian second-order derivative with the image at point <math>\mathbf{x}</math>.
+* These convolutions are approximated using box filters on an integral image.
+*: The approximations are denoted as <math>D_{xx}, D_{yy}, D_{xy}</math>
+* The determinant of the Hessian is then approximated as <math>D_{xx}D_{yy} - (0.9\,D_{xy})^2</math>
+** The weight 0.9 corrects for the box-filter approximation; it comes from the ratio of Frobenius norms:
+**: <math>\frac{|L_{xy}(1.2)|_{F}}{|L_{xx}(1.2)|_{F}}\frac{|D_{xx}(9)|_{F}}{|D_{xy}(9)|_{F}} = 0.912</math>
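+* Interest points are local maxima of this approximated determinant, which is used for both location and scale selection. The trace of the Hessian (the Laplacian) is not used for detection; its sign is used later to speed up matching.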
+===Scale-space representation===
+* The authors increase (e.g. double) the size of the box filters, rather than shrinking the image, to get representations at multiple scales.
+* The authors apply a "non-maximum suppression in a <math>3 \times 3 \times 3</math> neighborhood" to "localise interest points in the image and over scales".
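+** Non-maximum suppression is a filtering technique that removes duplicate detections. In the <math>3 \times 3 \times 3</math> case, a sample is kept only if its response exceeds that of its 26 neighbors in space and scale.
+**: Basic idea of the general, region-based form: let B be the set of candidate regions and D the filtered set to output.
+**: Move the highest-confidence region from B to D.
+**: Delete every region remaining in B whose IoU with the selected region exceeds a threshold, then repeat until B is empty.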
+**: See [https://towardsdatascience.com/non-maximum-suppression-nms-93ce178e177c non-maximum suppression]
+==SURF Descriptor==
+===Orientation Assignment===
+* Sample Haar-wavelet responses in the x and y directions at points around each feature.
+** Using integral images, only 6 operations are needed to compute the response in the x or y direction.
+**: The two adjacent boxes of the filter have 6 distinct corners, so 5 fma operations are needed in total for each direction.
+* Slide an orientation window covering an angle of <math>\frac{\pi}{3}</math> through the full 360 degrees around the feature; at each position, sum the horizontal and vertical responses inside the window to obtain a vector. Note that the window moves in increments of <math>\frac{\pi}{3}</math>.
+* Pick the direction of the longest vector.
+===Descriptor Components===
+* Create square regions positioned at feature points and oriented using the calculated orientation
+* ...
 ==Resources==
 * [https://medium.com/data-breach/introduction-to-surf-speeded-up-robust-features-c7396d6e7c4e Medium Introduction]