Computer Vision: Difference between revisions
(2 intermediate revisions by the same user not shown) | |||
Line 1: | Line 1: | ||
Notes from the [https://www.udacity.com/course/introduction-to-computer-vision--ud810 Udacity Computer Vision Course] taught by Georgia Tech professors. | Notes from the [https://www.udacity.com/course/introduction-to-computer-vision--ud810 Udacity Computer Vision Course] taught by Georgia Tech professors. | ||
==Hough Transform== | |||
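The Hough transform is a voting technique for finding shapes in images, such as lines, circles, and (in its generalized form) arbitrary shapes. Each edge point votes for every parameter setting consistent with it, and peaks in the vote accumulator indicate likely shapes. | |||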
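The voting idea can be illustrated for straight lines. The following is a minimal sketch, not code from the course: it assumes the normal-form line parameterization <math>\rho = x \cos\theta + y \sin\theta</math>, a 1-degree angle step, and integer rounding of <math>\rho</math>. The function names <code>hough_lines</code> and <code>strongest_line</code> are hypothetical.

```python
import math

def hough_lines(points, width, height, n_theta=180):
    """Accumulate votes in (rho, theta) space for lines through (x, y) points."""
    # Largest possible |rho| is the image diagonal.
    max_rho = int(math.ceil(math.hypot(width, height)))
    # Rows index rho in [-max_rho, max_rho]; columns index theta in [0, pi).
    acc = [[0] * n_theta for _ in range(2 * max_rho + 1)]
    for x, y in points:
        for t in range(n_theta):
            theta = math.pi * t / n_theta
            # Each point votes for every line (rho, theta) passing through it.
            rho = int(round(x * math.cos(theta) + y * math.sin(theta)))
            acc[rho + max_rho][t] += 1
    return acc, max_rho

def strongest_line(acc, max_rho):
    """Return (rho, theta, votes) for the accumulator cell with the most votes."""
    n_theta = len(acc[0])
    votes, r, t = max(
        (acc[r][t], r, t) for r in range(len(acc)) for t in range(n_theta)
    )
    return r - max_rho, math.pi * t / n_theta, votes

# Usage: 100 points on the vertical line x = 5 in a 100x100 image.
points = [(5, y) for y in range(100)]
acc, max_rho = hough_lines(points, 100, 100)
rho, theta, votes = strongest_line(acc, max_rho)
```

In practice the points would come from an edge detector, and several accumulator peaks (not just the maximum) would be reported as lines.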
==Image Features== | |||
===Histogram of Oriented Gradients (HOG)=== | |||
See [https://www.youtube.com/watch?v=28xk5i1_7Zc]. | |||
For each image, HOG generates a feature vector for each overlapping 16x16 patch of the image. | |||
* For each 8x8 patch, compute the gradient at each pixel. Each gradient has a norm (magnitude) and a direction. | |||
* Then bin the gradients by direction using bilinear binning (weighted voting), so that each angle bin accumulates a sum of norms (e.g. <math>\{0: x_0, 20: x_1, ..., 160: x_8\}</math>). For your 8x8 patch, <math>(x_0, ..., x_8)</math> is your feature vector or ''histogram''. This is called ''orientation binning''. | |||
* For each overlapping 16x16 patch, you have four 8x8 patches, each with a 9-dimensional feature vector. Concatenate all four to form a 36-dimensional feature vector. This feature vector is then normalized by its L2 norm. | |||
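The steps above can be sketched in plain Python. This is an illustrative sketch under several assumptions that the notes do not specify: central-difference gradients with clamped borders, unsigned angles in [0, 180) with 9 bins centered at 0, 20, ..., 160 degrees, votes split linearly between the two nearest bins, a block stride of one 8x8 patch, and a small <code>eps</code> to avoid division by zero. The function names are hypothetical.

```python
import math

def gradients(img):
    """Per-pixel gradient norm and unsigned direction (degrees in [0, 180))."""
    h, w = len(img), len(img[0])
    mag = [[0.0] * w for _ in range(h)]
    ang = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Central differences, clamped at the border (an assumption).
            gx = img[y][min(x + 1, w - 1)] - img[y][max(x - 1, 0)]
            gy = img[min(y + 1, h - 1)][x] - img[max(y - 1, 0)][x]
            mag[y][x] = math.hypot(gx, gy)
            ang[y][x] = math.degrees(math.atan2(gy, gx)) % 180.0
    return mag, ang

def cell_histogram(mag, ang, y0, x0, cell=8, n_bins=9):
    """Orientation binning for one cell with weighted voting."""
    width = 180.0 / n_bins  # 20 degrees per bin
    hist = [0.0] * n_bins
    for y in range(y0, y0 + cell):
        for x in range(x0, x0 + cell):
            pos = ang[y][x] / width
            lo = int(math.floor(pos)) % n_bins
            hi = (lo + 1) % n_bins  # wraps 160 degrees back to 0
            w_hi = pos - math.floor(pos)
            # Split the gradient norm between the two nearest bins.
            hist[lo] += mag[y][x] * (1.0 - w_hi)
            hist[hi] += mag[y][x] * w_hi
    return hist

def hog(img, cell=8, n_bins=9, eps=1e-6):
    """Return one L2-normalized 36-dim vector per overlapping 2x2-cell block."""
    mag, ang = gradients(img)
    n_cy, n_cx = len(img) // cell, len(img[0]) // cell
    cells = [[cell_histogram(mag, ang, i * cell, j * cell, cell, n_bins)
              for j in range(n_cx)] for i in range(n_cy)]
    blocks = []
    for i in range(n_cy - 1):
        for j in range(n_cx - 1):
            # Concatenate the four cell histograms: 4 * 9 = 36 values.
            v = cells[i][j] + cells[i][j + 1] + cells[i + 1][j] + cells[i + 1][j + 1]
            norm = math.sqrt(sum(c * c for c in v) + eps * eps)
            blocks.append([c / norm for c in v])
    return blocks

# Usage: a 32x32 horizontal ramp gives 3x3 = 9 overlapping blocks.
ramp = [[float(x) for x in range(32)] for _ in range(32)]
features = hog(ramp)
```

Because the ramp only changes along x, every gradient points at 0 degrees, so all of the weight in each block lands in the 0-degree bin of each cell.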
===SIFT=== | |||
{{main | SIFT features}} | |||
SIFT stands for Scale-Invariant Feature Transform. | |||
==Segmentation== | ==Segmentation== | ||
Line 10: | Line 26: | ||
;Pros | ;Pros | ||
* Automatically finds | * Automatically finds basins of attraction. | ||
* Only one parameter: Window size for region of interest. | * Only one parameter: Window size for region of interest. | ||
* Does not assume any shape on cluster. | * Does not assume any shape on cluster. |