Image Registration: Difference between revisions

From David's Wiki
 
(19 intermediate revisions by the same user not shown)
Line 1: Line 1:
Image registration is recovering an affine transformation (rotation + translation) between two images.
Image registration is recovering an affine transformation (scale, rotation, translation) between two images.
 
Image registration can be performed in the frequency domain with the Fourier transform or in the spatial domain with a log-polar transformation.


==Problem Statement==
==Problem Statement==
Line 28: Line 30:
\end{pmatrix}
\end{pmatrix}
</math>
</math>
==Phase correlation==
{{main | Wikipedia: Phase correlation}}
By the [https://en.wikipedia.org/wiki/Discrete_Fourier_transform#Shift_theorem Fourier shift theorem], a translation in the spatial domain is a linear phase shift in the Fourier domain.<br>
Hence, if you have two images with only a translation, you can identify the translation by estimating the phase shift.


==Log-Polar Transformation==
==Log-Polar Transformation==
This is copied from Wolberg and Zokai<ref name="wolberg2000robust">George Wolberg, and Siavash Zokai. ''Robust Image Registration Using Log-Polar Transform'' URL:[https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf]</ref>.
This is copied from Wolberg and Zokai<ref name="wolberg2000robust"/>. See also Reddy (1996)<ref name="reddy1996fft"/>.


The log-polar transformation is defined as follows:<br>
The log-polar transformation is defined as follows:<br>
\(
\(
\begin{align}
\begin{align}
b &= \log(r) = \log\left(\sqrt{(x-x_c)^2 + (y-y_c)^2}\right)\\
\rho &= \log(r) = \log\left(\sqrt{(x-x_c)^2 + (y-y_c)^2}\right)\\
a &= \operatorname{arctan2}(y-y_c, x-x_c)
\theta &= \operatorname{arctan2}(y-y_c, x-x_c)
\end{align}
\end{align}
\)<br>
\)<br>
where <math>(x_c, y_c)</math> is the center of the image and <math>r</math> is the distance from the center of the image.
where <math>(x_c, y_c)</math> is the center of the image and <math>r</math> is the distance from the center of the image.


Here a rotation in Cartesian coordinates <math>(x, y)</math> around the center \((x_c, y_c)\) corresponds to a shift in \(a\) in log-polar coordinates.
Here a rotation in Cartesian coordinates <math>(x, y)</math> around the center \((x_c, y_c)\) corresponds to a shift in \(\theta\) in log-polar coordinates.


A scale change (i.e. enlarge or stretch) is a shift in log-space:<br>
A scale change (i.e. enlarge or stretch) is a shift in log-space:<br>
\( \lambda r \mapsto \log(\lambda r) = \log(\lambda) + \log(r) \)<br>
\( \lambda r \mapsto \log(\lambda r) = \log(\lambda) + \log(r) \)<br>


These translations can be found using [[Wikipedia: Cross-correlation]].
These translations can be found using [[Wikipedia: Cross-correlation]] or [[Wikipedia: Phase correlation]].


;Algorithm
;Algorithm
Line 58: Line 65:
## If max correlation, save \((x, y)\) and \((dx, dy)\)
## If max correlation, save \((x, y)\) and \((dx, dy)\)
# Scale = \(dx\)
# Scale = \(dx\)
# Rotation = \(dy\)
# Translation = \(x, y\)
===Adaptive Log-Polar Transformation===
The goal of Adaptive Polar Transform by Matungka ''et al.''<ref name="matungka2009adaptive"/> is to address the non-uniform sampling of the log-polar transformation.
[[File: Adaptive polar transform fig4.png |  500px | Adaptive Polar Transform]]
[[File: Adaptive polar transform fig5.png |  500px | Adaptive Polar Transform Lena]]
{{ hidden | Motivation |
Let the following:
* \(n_\rho\) be the number of samples (i.e. resolution) along the \(\rho\) axis
* \(n_\theta\) be the number of samples along the \(\theta\) axis
* \(r_i\) the radius size for pixel <math display="inline">i=1,...,n_\rho</math>
* <math display="inline">\theta = 0,...,n_\theta - 1</math>
To prevent undersampling along \(\rho\), we must have \(R_{n_\rho} - R_{n_\rho-1} \leq 1\).
I.e. the outermost two circles must have \(\leq 1\) pixel difference.<br>
To prevent undersampling along \(\theta\), we must have at least \(2\pi R_{max}\) pixels along the \(\theta\) axis.
This leads to the following equations:<br>
<math display="block">
\begin{align}
R_i &= \exp(i \times \frac{\log R_{max}}{n_\rho}), \qquad R_{max} = R_{n_\rho}\\
n_\rho &\geq \frac{\log R_{max}}{\log R_{max} - \log(R_{max}-1)}\\
n_\theta &\geq 2\pi R_{max}
\end{align}
</math>
However, using these would lead to oversampling of the fovea region wasting computation resources.
}}
Let the following:
* \(n_r\) be the number of samples (i.e. resolution) along the \(r\) axis
* \(n_\theta\) be the number of samples along the \(\theta\) axis
Given an image of size \(2 R_{max} \times 2 R_{max}\), let (R_i\) the radius size for pixel \(i=1,...,n_\rho\).
The main idea is that the number of pixels \(n_\theta\) should be adaptive to the radius:
\[
\begin{align}
n_r &= R_{max}\\
n_{\theta_i} &= R_i
\end{align}
\]
To transform an image \(I\) to adaptive polar \(IP\), do the following
<pre>
for i=1 to n_r
  for j=1 to n_{theta_i}
    IP(i,j)=I(R_max + R_i*cos(2*pi*j/n_theta_i),
              R_max + R_i*sin(2*pi*j/n_theta_i))
</pre>
The paper continues to define a projection \(\mathfrak{R}, \Theta\) to a 1-d image:
\[
\Theta(j) = \sum_{i=1}^{n_4}\left[ \eta_{ij}^1 IP\left(i, \left\lceil\frac{j-1}{\Omega_i}\right\rceil\right) + \eta^2_{ij}IP\left(i, \left\lceil\frac{j}{\Omega_i}\right\rceil\right) \right]
\]
where:
* \(i \in [1,...,n_r]\)
* \(j \in [1,...,\hat{n}_\theta\)
* \(\Omega_i = \frac{\hat{n}_0}{n_{\theta_i}}\)
* \(\eta_{ij}^1=\Omega_i(j-1) - \lfloor \Omega_i(j-1)\rfloor\)
* \(\eta_{ij}^2=1-\eta_{ij}^1\)
* \(IP(i, 0) = 0 \quad \forall i\)
Then:
* A scale change appears as ''variable-scale'' i.e. \(\mathfrak{R}_1(\lambda r)\) while \(\Theta\) is slightly altered.
* A rotation in the cartesian image appears as a phase-shift in \(\Theta\).<br>


==References==
==References==
{{reflist|refs=
<ref name="wolberg2000robust">George Wolberg, and Siavash Zokai (2000). ''Robust Image Registration Using Log-Polar Transform'' DOI: [https://doi.org/10.1109/ICIP.2000.901003 10.1109/ICIP.2000.901003] URL: [https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf]</ref>
<ref name="reddy1996fft">B. S. Reddy and B. N. Chatterji, "An FFT-based technique for translation, rotation, and scale-invariant image registration," in IEEE Transactions on Image Processing, vol. 5, no. 8, pp. 1266-1271, Aug. 1996, doi: 10.1109/83.506761. https://ieeexplore.ieee.org/document/506761</ref>
<ref name="matungka2009adaptive"><cite class="journal">Rittavee Matungka, Yuan F. Zheng, and Robert L. Ewing (2009). ''Image Registration Using Adaptive Polar Transform'' DOI: [https://doi.org/10.1109/TIP.2009.2025010 10.1109/TIP.2009.2025010] URL: [https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf]</cite></ref>
}}

Latest revision as of 17:28, 28 June 2023

Image registration is recovering an affine transformation (scale, rotation, translation) between two images.

Image registration can be performed in the frequency domain with the Fourier transform or in the spatial domain with a log-polar transformation.

Problem Statement

We are given two images \(\displaystyle I_1\) and \(\displaystyle I_2\).
Let \(\displaystyle (x,y)\) be uv coordinates within the image.
We want to find a rotation and translation from \(\displaystyle (x,y)\) to \(\displaystyle (x',y')\) such that \(\displaystyle I_1(x,y) = I_2(x', y')\).
This is represented as:
\( \begin{align} x' &= a_1 x + a_2 y + a_3\\ y' &= a_4 x + a_5 y + a_6 \end{align} \)
This can also be written as:
\(\displaystyle \begin{pmatrix} x' \\ y' \\ 1 \end{pmatrix} = \begin{pmatrix} a_1 & a_2 & a_3\\ a_4 & a_5 & a_6\\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \)

Phase correlation

By the Fourier shift theorem, a translation in the spatial domain is a linear phase shift in the Fourier domain.
Hence, if you have two images with only a translation, you can identify the translation by estimating the phase shift.

Log-Polar Transformation

This is copied from Wolberg and Zokai[1]. See also Reddy (1996)[2].

The log-polar transformation is defined as follows:
\( \begin{align} \rho &= \log(r) = \log\left(\sqrt{(x-x_c)^2 + (y-y_c)^2}\right)\\ \theta &= \operatorname{arctan2}(y-y_c, x-x_c) \end{align} \)
where \(\displaystyle (x_c, y_c)\) is the center of the image and \(\displaystyle r\) is the distance from the center of the image.

Here a rotation in Cartesian coordinates \(\displaystyle (x, y)\) around the center \((x_c, y_c)\) corresponds to a shift in \(\theta\) in log-polar coordinates.

A scale change (i.e. enlarge or stretch) is a shift in log-space:
\( \lambda r \mapsto \log(\lambda r) = \log(\lambda) + \log(r) \)

These translations can be found using Wikipedia: Cross-correlation or Wikipedia: Phase correlation.

Algorithm

For each resolution from coarse to fine, do the following:

  1. Crop central region \(\displaystyle I_1'\) from \(\displaystyle I_1\)
  2. Compute the low-polar transformation \(\displaystyle I_{1p}'\)
  3. For all positions \((x,y)\)
    1. Crop region \(I_{2p}'\)
    2. Compute \(I_{2p}'\)
    3. Cross-correlate \(I_{1p}'\) and \(I_{2p}'\) to get \((dx, dy)\)
    4. If max correlation, save \((x, y)\) and \((dx, dy)\)
  4. Scale = \(dx\)
  5. Rotation = \(dy\)
  6. Translation = \(x, y\)

Adaptive Log-Polar Transformation

The goal of Adaptive Polar Transform by Matungka et al.[3] is to address the non-uniform sampling of the log-polar transformation.

Adaptive Polar Transform Adaptive Polar Transform Lena

Motivation

Let the following:

  • \(n_\rho\) be the number of samples (i.e. resolution) along the \(\rho\) axis
  • \(n_\theta\) be the number of samples along the \(\theta\) axis
  • \(r_i\) the radius size for pixel \(i=1,...,n_\rho\)
  • \(\theta = 0,...,n_\theta - 1\)

To prevent undersampling along \(\rho\), we must have \(R_{n_\rho} - R_{n_\rho-1} \leq 1\). I.e. the outermost two circles must have \(\leq 1\) pixel difference.
To prevent undersampling along \(\theta\), we must have at least \(2\pi R_{max}\) pixels along the \(\theta\) axis.

This leads to the following equations:
\[ \begin{align} R_i &= \exp(i \times \frac{\log R_{max}}{n_\rho}), \qquad R_{max} = R_{n_\rho}\\ n_\rho &\geq \frac{\log R_{max}}{\log R_{max} - \log(R_{max}-1)}\\ n_\theta &\geq 2\pi R_{max} \end{align} \]

However, using these would lead to oversampling of the fovea region wasting computation resources.

Let the following:

  • \(n_r\) be the number of samples (i.e. resolution) along the \(r\) axis
  • \(n_\theta\) be the number of samples along the \(\theta\) axis

Given an image of size \(2 R_{max} \times 2 R_{max}\), let (R_i\) the radius size for pixel \(i=1,...,n_\rho\).

The main idea is that the number of pixels \(n_\theta\) should be adaptive to the radius: \[ \begin{align} n_r &= R_{max}\\ n_{\theta_i} &= R_i \end{align} \]

To transform an image \(I\) to adaptive polar \(IP\), do the following

for i=1 to n_r
  for j=1 to n_{theta_i}
    IP(i,j)=I(R_max + R_i*cos(2*pi*j/n_theta_i),
              R_max + R_i*sin(2*pi*j/n_theta_i))

The paper continues to define a projection \(\mathfrak{R}, \Theta\) to a 1-d image: \[ \Theta(j) = \sum_{i=1}^{n_4}\left[ \eta_{ij}^1 IP\left(i, \left\lceil\frac{j-1}{\Omega_i}\right\rceil\right) + \eta^2_{ij}IP\left(i, \left\lceil\frac{j}{\Omega_i}\right\rceil\right) \right] \] where:

  • \(i \in [1,...,n_r]\)
  • \(j \in [1,...,\hat{n}_\theta\)
  • \(\Omega_i = \frac{\hat{n}_0}{n_{\theta_i}}\)
  • \(\eta_{ij}^1=\Omega_i(j-1) - \lfloor \Omega_i(j-1)\rfloor\)
  • \(\eta_{ij}^2=1-\eta_{ij}^1\)
  • \(IP(i, 0) = 0 \quad \forall i\)

Then:

  • A scale change appears as variable-scale i.e. \(\mathfrak{R}_1(\lambda r)\) while \(\Theta\) is slightly altered.
  • A rotation in the cartesian image appears as a phase-shift in \(\Theta\).

References

<templatestyles src="Reflist/styles.css" />

  1. George Wolberg, and Siavash Zokai (2000). Robust Image Registration Using Log-Polar Transform DOI: 10.1109/ICIP.2000.901003 URL: https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf
  2. B. S. Reddy and B. N. Chatterji, "An FFT-based technique for translation, rotation, and scale-invariant image registration," in IEEE Transactions on Image Processing, vol. 5, no. 8, pp. 1266-1271, Aug. 1996, doi: 10.1109/83.506761. https://ieeexplore.ieee.org/document/506761
  3. Rittavee Matungka, Yuan F. Zheng, and Robert L. Ewing (2009). Image Registration Using Adaptive Polar Transform DOI: 10.1109/TIP.2009.2025010 URL: https://home.cis.rit.edu/~cnspci/references/wolberg2000.pdf