Essential Matrix: Difference between revisions
(11 intermediate revisions by the same user not shown) | |||
Line 6: | Line 6: | ||
the essential matrix satisfies the equation <math>\mathbf{x}'^T \mathbf{E} \mathbf{x} = 0</math> | the essential matrix satisfies the equation <math>\mathbf{x}'^T \mathbf{E} \mathbf{x} = 0</math> | ||
Much of this is from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.7518 An Investigation of the Essential Matrix] by Richard Hartley<ref name="hartley">[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.7518 An Investigation of the Essential Matrix] by Richard Hartley</ref> | Much of this is from [http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.7518 An Investigation of the Essential Matrix] by Richard Hartley<ref name="hartley">[http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.7518 An Investigation of the Essential Matrix] by Richard Hartley</ref>. | ||
==Background and Derivation== | ==Background and Derivation== | ||
[[File: Epipolar_geometry.svg | link=Wikipedia | thumb | 400px | [[Wikipedia: Epipolar geometry]] ]] | [[File: Epipolar_geometry.svg | link=Wikipedia | thumb | 400px | [[Wikipedia: Epipolar geometry]] ]] | ||
A pinhole camera with <math>3 \times 4</math> projection matrix <math>P = K(R | A pinhole camera with <math>3 \times 4</math> projection matrix <math>P = K(R \mid -RT)</math> takes points <math>\mathbf{x} = (x, y, z)^T</math> and projects them to <math>\mathbf{u} = (u, v, w)^T = KR(\mathbf{x} - T)</math>. Here, the notation <math>(R \mid -RT)</math> represents a <math>3 \times 3</math> matrix <math>R</math> concatenated with a <math>3 \times 1</math> matrix <math>-RT</math> to form a <math>3 \times 4</math> matrix. | ||
We now consider two cameras: | We now consider two cameras: | ||
Camera 1 is at the origin of world space (or its object space) <math>P = (I | 0)</math>. | Camera 1 is at the origin of world space (or its object space) <math>P = (I | 0)</math>. | ||
Camera 2 is displaced with some rotation <math>R</math> and translation <math> | Camera 2 is displaced with some rotation <math>R</math> and translation <math>-RT</math>, <math>P' = (R | -RT)</math>.<br> | ||
Any point <math>\mathbf{u} = (u,v,w)^T</math> in camera 1 is represented by an epipolar line in camera 2.<br> | Any point <math>\mathbf{u} = (u,v,w)^T</math> in camera 1 is represented by an epipolar line in camera 2.<br> | ||
Under camera 2, the position of camera 1 is <math>-RT</math> and <math>P' (u,v,w,0)^T = R\mathbf{u}</math> is somewhere on this epipolar line. | Under camera 2, the position of camera 1 is <math>-RT</math> and <math>P' (u,v,w,0)^T = R\mathbf{u}</math> is somewhere on this epipolar line. | ||
Line 84: | Line 84: | ||
u'_3 u_1 \\ u'_3 u_2 \\ 1 \\ | u'_3 u_1 \\ u'_3 u_2 \\ 1 \\ | ||
\end{pmatrix} | \end{pmatrix} | ||
</math> | </math><br> | ||
Here <math>A</math> is an <math>n \times 9</math> matrix (where <math>n=8</math> if using 8 points). | |||
The goal is to minimize <math>\Vert A\mathbf{x} \Vert </math> such that <math>\Vert \mathbf{x} \Vert = 1</math> | The goal is to minimize <math>\Vert A\mathbf{x} \Vert </math> such that <math>\Vert \mathbf{x} \Vert = 1</math> | ||
Line 90: | Line 91: | ||
;Solution | ;Solution | ||
* First take the SVD of A: <math>A = UDV^T</math> | * First take the SVD of A: <math>A = UDV^T</math> | ||
** For <math>n = 8</math>, <math>U</math> is <math>8 \times 8</math>, <math>D</math> is an <math>8 \times 9</math> diagonal matrix, and <math>V^T</math> is a <math>9 \times 9</math> matrix. | |||
* Now <math>x = V_j</math>, the <math>j</math>-th column of <math>V</math>. Reshape this to get <math>Q_{est}</math>. | * Now <math>x = V_j</math>, the <math>j</math>-th column of <math>V</math>. Reshape this to get <math>Q_{est}</math>. | ||
* In practice, this may not be rank 2 so we take the SVD <math>Q_{est}=U diag(r,s,t) V^T</math> | * In practice, this may not be rank 2, so we take another SVD <math>Q_{est}=U diag(r,s,t) V^T</math> | ||
* Zero out the third singular value to get a final estimate | |||
*: <math>Q' = U diag(r,s,0) V^T</math> | *: <math>Q' = U diag(r,s,0) V^T</math> | ||
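The solution steps above can be sketched in NumPy. This is a minimal illustration rather than a reference implementation: the function name <code>estimate_essential</code> is hypothetical, the points are assumed to be homogeneous normalized image coordinates, and <math>j</math> is assumed to index the right singular vector belonging to the smallest singular value (the last row of <math>V^T</math> as returned by <code>np.linalg.svd</code>).

```python
import numpy as np


def estimate_essential(u1, u2):
    """Estimate Q such that u2[i]^T Q u1[i] is approximately 0.

    u1, u2: (n, 3) arrays of homogeneous normalized image points, n >= 8.
    (Hypothetical helper; names are not taken from the article.)
    """
    # Row i of A holds the nine products u2[i][a] * u1[i][b], ordered so that
    # A @ Q.reshape(9) equals the epipolar residuals u2[i]^T Q u1[i].
    A = np.einsum("ni,nj->nij", u2, u1).reshape(-1, 9)

    # The unit vector minimizing ||A x|| is the right singular vector for the
    # smallest singular value, i.e. the last row of V^T.
    _, _, Vt = np.linalg.svd(A)
    Q_est = Vt[-1].reshape(3, 3)

    # Q_est may not be exactly rank 2: take another SVD and zero out the
    # third singular value.
    U, d, Vt2 = np.linalg.svd(Q_est)
    return U @ np.diag([d[0], d[1], 0.0]) @ Vt2
```

With noisy correspondences, passing more than 8 points gives a least-squares estimate; the rank-2 step is then what makes the result a usable essential matrix.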
Line 120: | Line 123: | ||
* Here, <math>R</math> is your rotation and <math>S = [T]_{\times}</math> | * Here, <math>R</math> is your rotation and <math>S = [T]_{\times}</math> | ||
* <math>T = V (0, 0, 1)^T</math>, the third column of <math>V</math> or third row of <math>V^T</math> | * <math>T = V (0, 0, 1)^T</math>, the third column of <math>V</math> or third row of <math>V^T</math> | ||
** Note that this only gives you the direction of the translation. The magnitude is not determined. | |||
* Some sources such as Wikipedia use <math>[T]_{\times} = U Z U^T</math> and <math>T = U (0, 0, 1)^T</math>. | * Some sources such as Wikipedia use <math>[T]_{\times} = U Z U^T</math> and <math>T = U (0, 0, 1)^T</math>. | ||
** This is equivalent to <math>RT</math> in our notation. | ** This is equivalent to <math>RT</math> in our notation. | ||
Line 137: | Line 141: | ||
For planar images, only one of these 4 options is feasible. | For planar images, only one of these 4 options is feasible. | ||
You can determine which one is feasible by triangulating one of your points. | You can determine which one is feasible by triangulating one of your points. | ||
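The feasibility check can be sketched as follows, assuming the conventions used on this page: <math>Q = R[T]_{\times}</math>, camera 1 is <math>(I \mid 0)</math>, and camera 2 is <math>(R \mid -RT)</math>. The function names and the matrix name <code>W</code> are hypothetical, and the four candidates are assumed to be the two rotations <math>U W V^T</math> and <math>U W^T V^T</math> combined with <math>\pm T</math>. A candidate is accepted when the triangulated point has positive depth in both cameras.

```python
import numpy as np


def decompose_essential(Q):
    """Return the four candidate (R, T) pairs for Q ~ R [T]_x (T has unit length)."""
    U, _, Vt = np.linalg.svd(Q)
    # Q is only defined up to sign, so flip U or V^T if needed to make the
    # candidate rotations proper (determinant +1).
    if np.linalg.det(U) < 0:
        U = -U
    if np.linalg.det(Vt) < 0:
        Vt = -Vt
    W = np.array([[0.0, 1.0, 0.0], [-1.0, 0.0, 0.0], [0.0, 0.0, 1.0]])
    T = Vt[2]  # third column of V: the direction of the translation
    return [(U @ M @ Vt, sign * T) for M in (W, W.T) for sign in (1.0, -1.0)]


def depths(R, T, u1, u2):
    """Depths (lam, mu) of the triangulated point in camera 1 and camera 2.

    Solves lam * R u1 - mu * u2 = R T in the least-squares sense, which is
    the condition R (lam * u1 - T) = mu * u2.
    """
    A = np.column_stack([R @ u1, -u2])
    (lam, mu), *_ = np.linalg.lstsq(A, R @ T, rcond=None)
    return lam, mu


def pick_pose(Q, u1, u2):
    """Return the candidate (R, T) that places the point in front of both cameras."""
    for R, T in decompose_essential(Q):
        lam, mu = depths(R, T, u1, u2)
        if lam > 0 and mu > 0:
            return R, T
    return None
```

Because only the direction of the translation is recovered, the depths are in units of the unknown baseline length; their signs are what matter here. With noisy data it is safer to test several points and keep the candidate that most of them agree on.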
==3D points== | ==3D points== | ||
See [[Wikipedia: Essential_matrix]] | See [[Wikipedia: Essential_matrix]] | ||
==Fundamental Matrix== | |||
The fundamental matrix is a generalization of the essential matrix which also takes into account the calibration of the camera. | |||
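As a sketch of that relationship, assuming calibration matrices <math>K</math> and <math>K'</math> for the two cameras (symbols not defined elsewhere on this page) and pixel coordinates related to normalized coordinates by <math>\mathbf{p} = K\mathbf{x}</math> and <math>\mathbf{p}' = K'\mathbf{x}'</math>, substituting into <math>\mathbf{x}'^T \mathbf{E} \mathbf{x} = 0</math> gives
: <math>\mathbf{p}'^T \mathbf{F} \mathbf{p} = 0, \qquad \mathbf{F} = K'^{-T} \mathbf{E} K^{-1}, \qquad \mathbf{E} = K'^T \mathbf{F} K</math>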
==Resources== | ==Resources== | ||
* [[Wikipedia: Essential_matrix]] | * [[Wikipedia: Essential_matrix]] | ||
* [http://robotics.stanford.edu/~birch/projective/node20.html Stanford: essential and fundamental matrices] | |||
* [https://github.com/darknight1900/books/blob/master/Multiple%20View%20Geometry%20in%20Computer%20Vision%20(Second%20Edition).pdf Multiple View Geometry in Computer Vision by Hartley and Zisserman] | |||
==References== | ==References== |