Essential Matrix
Revision as of 13:21, 30 April 2020
An essential matrix, denoted \(\displaystyle \mathbf{E}\), is a \(\displaystyle 3 \times 3\) matrix that relates corresponding points in two calibrated camera views.
You can compute the essential matrix from feature matches between two images.
Using the essential matrix, you can extract the relative rotation and translation between the two cameras (the translation only up to scale).
Given matching feature points \(\displaystyle \mathbf{x}\) and \(\displaystyle \mathbf{x'}\) from two images, the essential matrix satisfies the equation \(\displaystyle \mathbf{x}'^T \mathbf{E} \mathbf{x} = 0\).
Background and Derivation
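Figure: epipolar geometry (Epipolar_geometry.svg, from Wikipedia: Epipolar geometry).
Much of this section follows "An Investigation of the Essential Matrix" by Richard Hartley [1] (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.64.7518).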
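A pinhole camera with \(\displaystyle 3 \times 4\) projection matrix \(\displaystyle P = K(R | -RT)\) takes a point \(\displaystyle \mathbf{x} = (x, y, z)^T\) and projects it to \(\displaystyle \mathbf{u} = (u, v, w)^T = KR(\mathbf{x} - T)\) in homogeneous image coordinates. For a calibrated camera we can take \(\displaystyle K = I\), which gives \(\displaystyle \mathbf{u} = R(\mathbf{x} - T)\).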
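As a quick numerical check of the projection above, the following minimal sketch builds \(\displaystyle P = K(R | -RT)\) and confirms that applying it to a homogeneous point matches \(\displaystyle KR(\mathbf{x} - T)\). The helper names `rotation_z` and `projection_matrix` and all numeric values are illustrative assumptions, not part of the original text.

```python
import numpy as np

def rotation_z(theta):
    """Rotation by theta radians about the z axis (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

def projection_matrix(K, R, T):
    """Build the 3x4 matrix P = K (R | -R T)."""
    return K @ np.hstack([R, (-R @ T).reshape(3, 1)])

# Assumed example values: identity intrinsics (calibrated camera).
K = np.eye(3)
R = rotation_z(0.3)
T = np.array([1.0, 0.5, -0.2])
P = projection_matrix(K, R, T)

x = np.array([0.4, -1.0, 5.0])       # a 3D point in world space
u_from_P = P @ np.append(x, 1.0)     # project the homogeneous point (x, 1)
u_direct = K @ R @ (x - T)           # closed form from the text

print(np.allclose(u_from_P, u_direct))
```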
We now consider two cameras:
Camera 1 is at the origin of world space (its object space), so \(\displaystyle P = (I | 0)\).
Camera 2 is displaced by some rotation \(\displaystyle R\) and translation \(\displaystyle T\), so \(\displaystyle P' = (R | -RT)\).
Any point \(\displaystyle \mathbf{u} = (u,v,w)^T\) in camera 1 corresponds to an epipolar line in camera 2.
Under camera 2, the center of camera 1 projects to \(\displaystyle P' (0,0,0,1)^T = -RT\), and the point at infinity along the ray through \(\displaystyle \mathbf{u}\) projects to \(\displaystyle P' (u,v,w,0)^T = R\mathbf{u}\), which also lies on this epipolar line.
Thus the line, in homogeneous coordinates and up to scale, is the cross product of these two image points.
\[
(p,q,r)^T = RT \times R\mathbf{u} = R(T \times \mathbf{u}) = R[T]_{\times} \mathbf{u}
\]
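Now the line is represented by \(\displaystyle \{(u',v',w') \mid pu' + qv' + rw' = 0\}\), i.e. all points orthogonal to \(\displaystyle (p,q,r)^T\).
Since the matching point \(\displaystyle \mathbf{u}'\) in camera 2 must lie on this line, \(\displaystyle \mathbf{u}'^T R[T]_{\times} \mathbf{u} = 0\), so in this convention the essential matrix is \(\displaystyle \mathbf{E} = R[T]_{\times}\).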
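The epipolar line formula can be checked numerically. The sketch below assumes the two-camera setup from this section, \(\displaystyle P = (I | 0)\) and \(\displaystyle P' = (R | -RT)\), takes \(\displaystyle R[T]_{\times}\) as the matrix mapping \(\displaystyle \mathbf{u}\) to its line, and verifies that both the projected point and the image of camera 1's center lie on that line. The helper names and numeric values are assumptions chosen for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ u == np.cross(t, u)."""
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def rotation_y(theta):
    """Rotation by theta radians about the y axis (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[  c, 0.0,   s],
                     [0.0, 1.0, 0.0],
                     [ -s, 0.0,   c]])

# Assumed relative pose of camera 2.
R = rotation_y(0.25)
T = np.array([0.8, 0.1, -0.3])
E = R @ skew(T)                 # maps a point u in camera 1 to its epipolar line

X = np.array([0.3, -0.2, 4.0])  # a 3D point in world space
u = X                           # camera 1: P = (I | 0)
u_prime = R @ (X - T)           # camera 2: P' = (R | -RT)

line = E @ u                    # (p, q, r)^T
residual = u_prime @ line       # u'^T E u, expected to vanish
epipole_residual = (-R @ T) @ line  # image of camera 1's center is on the line too

print(abs(residual) < 1e-9, abs(epipole_residual) < 1e-9)
```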
Given a vector \(\displaystyle \mathbf{t}\), the matrix form of its cross product is:
\(\displaystyle
[\mathbf{t}]_{\times} =
\begin{bmatrix}
0 & -t_z & t_y\\
t_z & 0 & -t_x\\
-t_y & t_x & 0
\end{bmatrix}
\)
- \(\displaystyle [\mathbf{t}]_{\times} \mathbf{u} = \mathbf{t} \times \mathbf{u}\)
- This matrix is skew-symmetric, i.e. \(\displaystyle [\mathbf{t}]^T_{\times} = -[\mathbf{t}]_{\times}\)
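Both properties of the cross-product matrix are easy to confirm numerically. This is a minimal sketch; the function name `skew` and the test vectors are assumptions for illustration.

```python
import numpy as np

def skew(t):
    """Return the 3x3 cross-product matrix [t]_x of a 3-vector t."""
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

t = np.array([1.0, 2.0, 3.0])
u = np.array([-0.5, 4.0, 2.0])
S = skew(t)

cross_matches = np.allclose(S @ u, np.cross(t, u))  # [t]_x u == t x u
is_skew = np.allclose(S.T, -S)                      # [t]_x^T == -[t]_x

print(cross_matches, is_skew)
```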
Given matching feature points \(\displaystyle \mathbf{x}\) and \(\displaystyle \mathbf{x'}\) from two images, we can equivalently relate them with a rotation \(\displaystyle \mathbf{R}\) and a translation \(\displaystyle \mathbf{t}\) such that \(\displaystyle \mathbf{R}\mathbf{x} + \mathbf{t} = \mathbf{x}'\). Comparing with \(\displaystyle P' = (R | -RT)\), this corresponds to \(\displaystyle \mathbf{t} = -RT\).
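The two conventions can be compared numerically. The sketch below assumes the common form \(\displaystyle \mathbf{E} = [\mathbf{t}]_{\times} \mathbf{R}\) for the convention \(\displaystyle \mathbf{x}' = \mathbf{R}\mathbf{x} + \mathbf{t}\), sets \(\displaystyle \mathbf{t} = -RT\), and checks that it agrees with \(\displaystyle R[T]_{\times}\) up to sign, which does not matter because the constraint is homogeneous. Helper names and numeric values are illustrative assumptions.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x."""
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def rotation_x(theta):
    """Rotation by theta radians about the x axis (illustrative helper)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[1.0, 0.0, 0.0],
                     [0.0,   c,  -s],
                     [0.0,   s,   c]])

R = rotation_x(0.4)
T = np.array([0.5, -1.0, 0.3])   # camera-2 center in world space (assumed)
t = -R @ T                       # translation in the x' = R x + t convention

E_alt = skew(t) @ R              # E = [t]_x R
E_deriv = R @ skew(T)            # E = R [T]_x from the derivation

x = np.array([1.0, 0.5, 6.0])    # point in camera-1 coordinates
x_prime = R @ x + t              # same point in camera-2 coordinates
residual = x_prime @ E_alt @ x   # x'^T E x, expected to vanish

same_up_to_sign = np.allclose(E_alt, -E_deriv)
print(abs(residual) < 1e-9, same_up_to_sign)
```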
Properties
- A \(\displaystyle 3 \times 3\) matrix is an essential matrix iff two of its singular values are equal and the third singular value is \(\displaystyle 0\).
See Bartoli and Olsen[2].
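The singular value property suggests a simple numerical test. The following is a minimal sketch, not a definitive implementation: the function `is_essential`, its relative tolerance, and the extra requirement that the two equal singular values be nonzero (so the zero matrix is rejected) are assumptions added for illustration.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x."""
    tx, ty, tz = t
    return np.array([[0.0, -tz,  ty],
                     [ tz, 0.0, -tx],
                     [-ty,  tx, 0.0]])

def is_essential(M, tol=1e-9):
    """Check that M has singular values (s, s, 0) with s > 0, up to tol."""
    s = np.linalg.svd(M, compute_uv=False)  # returned in descending order
    if s[0] <= tol:
        return False                        # assumption: reject the zero matrix
    return bool(abs(s[0] - s[1]) <= tol * s[0] and s[2] <= tol * s[0])

# Build a matrix of the form R [T]_x with an assumed rotation about z.
c, s_ = np.cos(0.7), np.sin(0.7)
R = np.array([[c, -s_, 0.0],
              [s_,  c, 0.0],
              [0.0, 0.0, 1.0]])
T = np.array([0.2, -0.4, 1.0])
E = R @ skew(T)

print(is_essential(E), is_essential(np.eye(3)))
```

For a matrix of the form \(\displaystyle R[T]_{\times}\), the two nonzero singular values both equal the length of \(\displaystyle T\).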
Calculating the Essential Matrix from two images
Planar Images
Spherical Images
Here we assume an equirectangular projection.
Determining rotation \(\displaystyle \mathbf{R}\) and translation \(\displaystyle \mathbf{t}\)
3D points
See Wikipedia: Essential_matrix