Computer Graphics: Difference between revisions

Basics of Computer Graphics

==Homogeneous Coordinates==

[http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/ http://www.opengl-tutorial.org/beginners-tutorials/tutorial-3-matrices/]

Points and vectors are represented using homogeneous coordinates in computer graphics.

This allows affine transformations in 3D (e.g. rotation and translation) to be represented as a single matrix multiplication.

While rotations can typically be represented in a 3x3 matrix multiplication, a translation requires a [[wikipedia:Shear mapping | ''shear'']] in 4D.

Points are <math>(x,y,z,1)</math> and vectors are <math>(x,y,z,0)</math>.

The last coordinate of a point allows translations to be represented as matrix multiplications, while the 0 in the last coordinate of a vector leaves it unaffected by translation.

;Notes

* The point <math>(kx, ky, kz, k)</math> with <math>k \neq 0</math> is equivalent to <math>(x, y, z, 1)</math>.

Affine transformations include translations, rotations, and scaling, as well as any composition of them.

===Translation Matrix===

<math>

T =

\begin{bmatrix}

1 & 0 & 0 & X\\

0 & 1 & 0 & Y\\

0 & 0 & 1 & Z\\

0 & 0 & 0 & 1

\end{bmatrix}

</math>
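
As a minimal sketch of why the last coordinate matters, the snippet below applies the translation matrix to a point and to a vector. The helper names <code>translation</code> and <code>apply</code> are illustrative and not taken from any library; matrices are plain nested lists in row-major order.

```python
def translation(tx, ty, tz):
    """Return a 4x4 translation matrix as nested lists (row-major)."""
    return [
        [1.0, 0.0, 0.0, tx],
        [0.0, 1.0, 0.0, ty],
        [0.0, 0.0, 1.0, tz],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


T = translation(2.0, 3.0, 4.0)
point = [1.0, 1.0, 1.0, 1.0]   # last coordinate 1: a position
vector = [1.0, 1.0, 1.0, 0.0]  # last coordinate 0: a direction

moved_point = apply(T, point)    # the translation column is picked up
moved_vector = apply(T, vector)  # the translation column is multiplied by 0
```

The point is shifted by the translation while the vector comes back unchanged.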

===Rotation Matrix===

Rotations can be about the X, Y, or Z axis.

Below is a rotation about the Z axis by angle <math>\theta</math>.

<math>

R = \begin{bmatrix}

\cos(\theta) & -\sin(\theta) & 0 & 0\\

\sin(\theta) & \cos(\theta) & 0 & 0\\

0 & 0 & 1 & 0\\

0 & 0 & 0 & 1

\end{bmatrix}

</math>

To formulate a rotation about an arbitrary axis, we use [[Wikipedia:Rodrigues' rotation formula]].

Suppose we want to rotate by angle <math>\theta</math> around the unit-length axis <math>\mathbf{k}=(k_x, k_y, k_z)</math>.

Let <math>

\mathbf{K} = [\mathbf{k}]_{\times} =

\begin{bmatrix}

0 & -k_z & k_y\\

k_z & 0 & -k_x\\

-k_y & k_x & 0

\end{bmatrix}</math>

Then the rotation matrix is <math>\mathbf{R} = \mathbf{I}_{3} + (\sin \theta)\mathbf{K} + (1 - \cos \theta)\mathbf{K}^2</math>

Here the 4x4 form is:

<math>

R = \begin{bmatrix}

\mathbf{R} & \mathbf{0}\\

\mathbf{0}^T & 1

\end{bmatrix}

</math>
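
The Rodrigues formula above can be sketched in a few lines of plain Python. The function names <code>rodrigues</code> and <code>to_homogeneous</code> are hypothetical, and the sketch normalizes the axis itself because the formula assumes a unit-length axis.

```python
import math


def rodrigues(axis, theta):
    """3x3 rotation matrix for a rotation by theta (radians) about axis."""
    kx, ky, kz = axis
    norm = math.sqrt(kx * kx + ky * ky + kz * kz)
    kx, ky, kz = kx / norm, ky / norm, kz / norm  # the formula assumes a unit axis
    # Cross-product matrix K = [k]_x
    K = [[0.0, -kz, ky],
         [kz, 0.0, -kx],
         [-ky, kx, 0.0]]
    K2 = [[sum(K[i][k] * K[k][j] for k in range(3)) for j in range(3)]
          for i in range(3)]
    s, c = math.sin(theta), math.cos(theta)
    # R = I + sin(theta) K + (1 - cos(theta)) K^2
    return [[(1.0 if i == j else 0.0) + s * K[i][j] + (1.0 - c) * K2[i][j]
             for j in range(3)] for i in range(3)]


def to_homogeneous(R):
    """Embed a 3x3 rotation into the 4x4 block form shown above."""
    return [R[0] + [0.0], R[1] + [0.0], R[2] + [0.0], [0.0, 0.0, 0.0, 1.0]]


R3 = rodrigues((0.0, 0.0, 1.0), math.pi / 3)
R4 = to_homogeneous(R3)
```

With the axis set to the Z axis, the result matches the Z-axis rotation matrix given earlier in this section.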

===Scaling Matrix===

<math>

S = \begin{bmatrix}

X & 0 & 0 & 0\\

0 & Y & 0 & 0\\

0 & 0 & Z & 0\\

0 & 0 & 0 & 1

\end{bmatrix}

</math>

===Transformation matrix===

<math>

L = T * R * S

</math>
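
Because matrices act on column vectors from the right, <math>T * R * S</math> applies the scale first, then the rotation, then the translation. The following sketch illustrates this with hypothetical helper functions and also shows that the reversed order gives a different result.

```python
import math


def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists (row-major)."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def apply(m, v):
    """Multiply a 4x4 matrix by a 4-component column vector."""
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


def translation(x, y, z):
    return [[1.0, 0.0, 0.0, x], [0.0, 1.0, 0.0, y],
            [0.0, 0.0, 1.0, z], [0.0, 0.0, 0.0, 1.0]]


def rotation_z(theta):
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, 0.0], [s, c, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]


def scaling(x, y, z):
    return [[x, 0.0, 0.0, 0.0], [0.0, y, 0.0, 0.0],
            [0.0, 0.0, z, 0.0], [0.0, 0.0, 0.0, 1.0]]


T = translation(5.0, 0.0, 0.0)
R = rotation_z(math.pi / 2)
S = scaling(2.0, 2.0, 2.0)
p = [1.0, 0.0, 0.0, 1.0]

# Scale (2,0,0), rotate 90 degrees about Z (0,2,0), then translate (5,2,0).
trs = apply(matmul(T, matmul(R, S)), p)
# Reversed order: translate (6,0,0), rotate (0,6,0), then scale (0,12,0).
srt = apply(matmul(S, matmul(R, T)), p)
```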

Depending on implementation, it may be more memory-efficient or compute-efficient to represent affine transformations as their own class rather than 4x4 matrices. For example, a rotation can be represented with 3 floats in angle-axis or 4 floats in quaternion coordinates rather than a 3x3 rotation matrix.

For example, see

* [https://eigen.tuxfamily.org/dox/classEigen_1_1Transform.html Eigen::Transform]

===Barycentric Coordinates===

==MVP Matrices==

To convert from model coordinates <math>v</math> to screen coordinates <math>w</math>, you multiply by the MVP matrices: <math>w=P*V*M*v</math>

* The model matrix <math>M</math> applies the transform of your object. This includes the position and rotation. <math>M*v</math> is in world coordinates.

* The view matrix <math>V</math> applies the inverse transform of your camera. <math>V*M*v</math> is in camera or view coordinates (i.e. coordinates relative to the camera).

* The projection matrix <math>P</math> applies the projection of your camera, typically an orthographic or a perspective camera. The perspective camera shrinks objects in the distance.

===Model Matrix===

[https://gamedev.stackexchange.com/questions/16719/what-is-the-correct-order-to-multiply-scale-rotation-and-translation-matrices-f Order of matrices]

The model matrix is the product of the element's scale, rotation, and translation matrices, with the scale applied first and the translation last.

<math>M = T * R * S</math>

===View Matrix===

[https://www.scratchapixel.com/lessons/mathematics-physics-for-computer-graphics/lookat-function Lookat function]

The view matrix is a 4x4 matrix which encodes the position and rotation of the camera.

Given a camera at position <math>\mathbf p</math> looking at target <math>\mathbf t</math> with up vector <math>\mathbf u</math>:

We can calculate the forward vector (from target to position) as <math>\mathbf{f}=\mathbf{p} - \mathbf{t}</math>.

We can calculate the right vector as <math>\mathbf{r} = \mathbf u \times \mathbf f</math>.

Both vectors are normalized to unit length. Then the view matrix is written as:

<pre>

r_x r_y r_z 0

Line 30:

Line 116:

}

</pre>
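
A look-at view matrix can be sketched as follows. This is one common right-handed, OpenGL-style construction and is an assumption rather than a transcription of the matrix above: it folds the camera translation into the last column, recomputes the up vector so the basis is orthonormal, and uses hypothetical helper names.

```python
import math


def sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def normalize(v):
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


def look_at(position, target, up):
    """4x4 view matrix mapping world coordinates to camera coordinates."""
    f = normalize(sub(position, target))  # forward: from target to position
    r = normalize(cross(up, f))           # right
    u = cross(f, r)                       # re-orthogonalized up
    return [
        [r[0], r[1], r[2], -dot(r, position)],
        [u[0], u[1], u[2], -dot(u, position)],
        [f[0], f[1], f[2], -dot(f, position)],
        [0.0, 0.0, 0.0, 1.0],
    ]


def apply(m, v):
    return [sum(m[i][j] * v[j] for j in range(4)) for i in range(4)]


V = look_at([0.0, 0.0, 5.0], [0.0, 0.0, 0.0], [0.0, 1.0, 0.0])
camera_in_view = apply(V, [0.0, 0.0, 5.0, 1.0])  # camera sits at the view origin
target_in_view = apply(V, [0.0, 0.0, 0.0, 1.0])  # target lies along -z
```

In this convention the camera position maps to the origin and the target ends up on the negative z axis, since forward points from the target to the camera.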

===Perspective Projection Matrix===

[https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/building-basic-perspective-projection-matrix https://www.scratchapixel.com/lessons/3d-basic-rendering/perspective-and-orthographic-projection-matrix/building-basic-perspective-projection-matrix]

https://www.songho.ca/opengl/gl_projectionmatrix.html

The projection matrix applies a perspective projection based on the field of view of the camera. This is done by dividing the x,y view coordinates by the z-coordinate so that farther objects appear closer to the center. Note that the output is typically in normalized device coordinates (NDC) <math>[-1, 1]\times[-1, 1]</math> rather than image coordinates <math>\{0, ..., W-1\} \times \{0, ..., H-1\}</math>. Additionally, in NDC, the y-coordinate typically points upwards, unlike in image coordinates.

The Z-coordinate in the projection matrix represents a remapped version of the z-depth, i.e. depth along the camera forward axis. In OpenGL, this maps z=-f to 1 and z=-n to -1, where -z is forward and n and f are the distances to the near and far clipping planes.

Notes: In computer vision, this is analogous to the calibration matrix <math>K</math>.

The calibration matrix contains the intrinsic parameters of your pinhole camera, such as the focal length, which is related to the field of view. For a fixed field of view, the focal length measured in pixels scales with the resolution of your output.
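
The OpenGL-style mapping described above can be checked with a small sketch. The matrix below is the standard symmetric-frustum form with a vertical field of view; the function names <code>perspective</code> and <code>project</code> are illustrative only.

```python
import math


def perspective(fov_y, aspect, near, far):
    """OpenGL-style perspective matrix; -z is forward, NDC z lies in [-1, 1]."""
    focal = 1.0 / math.tan(fov_y / 2.0)
    return [
        [focal / aspect, 0.0, 0.0, 0.0],
        [0.0, focal, 0.0, 0.0],
        [0.0, 0.0, (far + near) / (near - far), 2.0 * far * near / (near - far)],
        [0.0, 0.0, -1.0, 0.0],
    ]


def project(P, point):
    """Multiply by P, then perform the perspective divide by w."""
    v = (point[0], point[1], point[2], 1.0)
    clip = [sum(P[i][j] * v[j] for j in range(4)) for i in range(4)]
    return [clip[0] / clip[3], clip[1] / clip[3], clip[2] / clip[3]]


P = perspective(math.radians(90.0), 1.0, 0.1, 100.0)
ndc_near = project(P, (0.0, 0.0, -0.1))    # on the near plane: z maps to -1
ndc_far = project(P, (0.0, 0.0, -100.0))   # on the far plane: z maps to 1
ndc_edge = project(P, (1.0, 1.0, -1.0))    # on the frustum edge: x and y map to 1
```

The last row of the matrix copies the negated view-space z into w, so the divide by w is what makes distant objects shrink toward the center.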

===Inverting the projection===

If you have the depth (either z-depth or Euclidean depth), you can invert the projection operation.

The idea is to construct a ray from the camera through the pixel on a plane of the viewing frustum and scale the distance accordingly.

See [https://gamedev.stackexchange.com/questions/108856/fast-position-reconstruction-from-depth/111885#111885 stackexchange].
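
The ray construction can be sketched as below, assuming a symmetric perspective camera with vertical field of view <code>fov_y</code>, NDC x,y in [-1, 1], and -z as the forward axis. All function names are hypothetical, and the forward projection is included only so that the round trip can be checked.

```python
import math


def pixel_ray(ndc_x, ndc_y, fov_y, aspect):
    """Ray through the pixel on the plane one unit in front of the camera."""
    half_height = math.tan(fov_y / 2.0)
    half_width = half_height * aspect
    return (ndc_x * half_width, ndc_y * half_height, -1.0)


def unproject_z_depth(ndc_x, ndc_y, z_depth, fov_y, aspect):
    """View-space position from depth measured along the forward axis."""
    ray = pixel_ray(ndc_x, ndc_y, fov_y, aspect)
    return tuple(c * z_depth for c in ray)


def unproject_euclidean_depth(ndc_x, ndc_y, distance, fov_y, aspect):
    """View-space position from straight-line distance to the camera."""
    ray = pixel_ray(ndc_x, ndc_y, fov_y, aspect)
    length = math.sqrt(sum(c * c for c in ray))
    return tuple(c / length * distance for c in ray)


def project_xy(point, fov_y, aspect):
    """Forward perspective projection of a view-space point to NDC x,y."""
    x, y, z = point
    half_height = math.tan(fov_y / 2.0)
    return (x / (-z * half_height * aspect), y / (-z * half_height))


fov_y, aspect = math.radians(60.0), 1.5
original = (1.0, -2.0, -4.0)
ndc = project_xy(original, fov_y, aspect)
from_z = unproject_z_depth(ndc[0], ndc[1], 4.0, fov_y, aspect)
from_dist = unproject_euclidean_depth(ndc[0], ndc[1], math.sqrt(21.0), fov_y, aspect)
```

The only difference between the two variants is whether the ray is normalized before scaling: z-depth scales the ray with unit forward component, Euclidean depth scales the unit-length ray.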

==Shading==

===Interpolation===

* Flat shading - color is computed for each face/triangle.

* Gouraud shading - color is computed for each vertex and interpolated across the face.

* Phong shading - color is computed for each pixel with the normal vector interpolated from each vertex.
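
The difference between the three modes can be sketched for a single triangle. The example below assumes a simple diffuse term as the shading function and barycentric weights for interpolation inside the triangle; all names are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def shade(normal, to_light):
    """Diffuse intensity for one normal (assumed shading function)."""
    return max(0.0, dot(normalize(normal), normalize(to_light)))


def interpolate(values, weights):
    """Blend three per-vertex tuples with barycentric weights."""
    return tuple(sum(w * v[i] for w, v in zip(weights, values))
                 for i in range(len(values[0])))


def flat(face_normal, to_light):
    # One intensity for the whole face.
    return shade(face_normal, to_light)


def gouraud(vertex_normals, weights, to_light):
    # Shade at the vertices, then interpolate the intensities.
    per_vertex = [(shade(n, to_light),) for n in vertex_normals]
    return interpolate(per_vertex, weights)[0]


def phong(vertex_normals, weights, to_light):
    # Interpolate the normal, then shade at the pixel.
    return shade(interpolate(vertex_normals, weights), to_light)


normals = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
light = (0.0, 0.0, 1.0)
center = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
gouraud_center = gouraud(normals, center, light)
phong_center = phong(normals, center, light)
```

At a vertex both methods agree, but in the interior they generally differ because shading is not linear in the normal.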

===Lambert reflectance===

This is a way to model diffuse (matte) materials.

<math>I_D = (\mathbf{L} \cdot \mathbf{N}) * C * I_{L}</math>

* <math>\mathbf{N}</math> is the unit normal vector.

* <math>\mathbf{L}</math> is the unit vector from the surface point to the light.

* <math>C</math> is the color.

* <math>I_{L}</math> is the intensity of light.
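
A minimal sketch of the Lambert term follows. The function name is hypothetical, and clamping the dot product at zero is an added assumption so that surfaces facing away from the light receive no diffuse light.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def lambert(normal, to_light, color, light_intensity):
    """Diffuse color I_D = (L . N) * C * I_L, with the dot product clamped at 0."""
    n_dot_l = max(0.0, dot(normalize(normal), normalize(to_light)))
    return tuple(n_dot_l * c * light_intensity for c in color)


facing = lambert((0.0, 0.0, 1.0), (0.0, 0.0, 3.0), (1.0, 0.5, 0.25), 2.0)
away = lambert((0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (1.0, 0.5, 0.25), 2.0)
```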

===Phong reflection model===

See [https://www.scratchapixel.com/lessons/3d-basic-rendering/phong-shader-BRDF scratchapixel phong shader BRDF].

This is a way to model specular (shiny) materials.

Here, the image is a linear combination of ambient, diffuse, and specular colors.

If <math>\mathbf{N}</math> is the normal vector, <math>\mathbf{V}</math> is a vector from the vertex to the viewer, <math>\mathbf{L}</math> is a vector from the vertex to the light, and <math>\mathbf{R}</math> is the reflected vector (i.e. <math>\mathbf{L}</math> rotated 180° around <math>\mathbf{N}</math>), all of unit length, then

* Ambient is a constant color for every pixel.

* The diffuse coefficient is <math>\mathbf{N} \cdot \mathbf{L}</math>.

* The specular coefficient is <math>(\mathbf{R} \cdot \mathbf{V})^n</math> where <math>n</math> is the ''shininess''.

The final color is <math>k_{ambient} * ambientColor + k_{diffuse} * (\mathbf{N} \cdot \mathbf{L}) * diffuseColor + k_{specular} * (\mathbf{R} \cdot \mathbf{V})^n * specularColor</math>.

;Notes

* The diffuse and specular components need to be computed for every visible light source.
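
The Phong sum can be sketched for a single light as below. The sketch assumes <math>\mathbf{L}</math> points from the surface to the light, computes the reflected vector as <math>\mathbf{R} = 2(\mathbf{N} \cdot \mathbf{L})\mathbf{N} - \mathbf{L}</math>, and clamps negative dot products to zero; the function and parameter names are illustrative.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    n = math.sqrt(dot(v, v))
    return tuple(x / n for x in v)


def phong(normal, to_light, to_viewer, k_ambient, k_diffuse, k_specular,
          ambient_color, diffuse_color, specular_color, shininess):
    """Phong color for one light; for several lights, sum the last two terms."""
    n = normalize(normal)
    l = normalize(to_light)
    v = normalize(to_viewer)
    n_dot_l = max(0.0, dot(n, l))
    # Reflect L about N: R = 2 (N . L) N - L
    r = tuple(2.0 * dot(n, l) * n[i] - l[i] for i in range(3))
    # No highlight when the light is behind the surface.
    r_dot_v = max(0.0, dot(r, v)) if n_dot_l > 0.0 else 0.0
    specular = r_dot_v ** shininess
    return tuple(k_ambient * a + k_diffuse * n_dot_l * d + k_specular * specular * s
                 for a, d, s in zip(ambient_color, diffuse_color, specular_color))


white, red = (1.0, 1.0, 1.0), (1.0, 0.0, 0.0)
head_on = phong((0.0, 0.0, 1.0), (0.0, 0.0, 1.0), (0.0, 0.0, 1.0),
                0.1, 0.6, 0.3, white, red, white, 32)
unlit = phong((0.0, 0.0, 1.0), (0.0, 0.0, -1.0), (0.0, 0.0, 1.0),
              0.1, 0.6, 0.3, white, red, white, 32)
```

With the light and viewer both along the normal, all three terms contribute fully; with the light behind the surface, only the ambient term remains.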

===Physically Based===

See [https://static1.squarespace.com/static/58586fa5ebbd1a60e7d76d3e/t/593a3afa46c3c4a376d779f6/1496988449807/s2012_pbs_disney_brdf_notes_v2.pdf pbs disney brdf notes] and the [http://www.pbr-book.org/ pbr-book]

In frameworks and libraries, these are often referred to as ''standard materials'' or, in Blender, ''Principled BSDF''.

==Blending and Pixel Formats==

===Pixel Formats===

===Blending===

To output transparent images, i.e. images with alpha, you'll generally want to blend using [[Premultiplied Alpha]]. Rendering in premultiplied alpha prevents your RGB color values from getting mixed with the background color of empty pixels.
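
The effect can be illustrated by compositing a half-transparent red pixel over an empty (fully transparent, black) pixel. The sketch below uses the standard "over" operator for premultiplied colors and, for contrast, a naive straight-alpha blend; the function names are hypothetical.

```python
def premultiply(rgb, alpha):
    """Convert straight RGB plus alpha to premultiplied RGBA."""
    return (rgb[0] * alpha, rgb[1] * alpha, rgb[2] * alpha, alpha)


def unpremultiply(rgba):
    """Recover straight RGBA; fully transparent pixels have no defined color."""
    a = rgba[3]
    if a == 0.0:
        return (0.0, 0.0, 0.0, 0.0)
    return (rgba[0] / a, rgba[1] / a, rgba[2] / a, a)


def over_premultiplied(src, dst):
    """Composite premultiplied src over premultiplied dst, all four channels alike."""
    inv = 1.0 - src[3]
    return tuple(s + inv * d for s, d in zip(src, dst))


def over_straight_naive(src, dst):
    """Naive straight-alpha blend: RGB is mixed with dst even where dst is empty."""
    a = src[3]
    rgb = tuple(a * s + (1.0 - a) * d for s, d in zip(src[:3], dst[:3]))
    return rgb + (a + (1.0 - a) * dst[3],)


empty = (0.0, 0.0, 0.0, 0.0)
red_half = (1.0, 0.0, 0.0, 0.5)

premultiplied = over_premultiplied(premultiply(red_half[:3], red_half[3]), empty)
recovered = unpremultiply(premultiplied)     # still pure red at alpha 0.5
naive = over_straight_naive(red_half, empty)  # red darkened by the empty black
```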

===Rendering===

For rasterization, the render loop typically consists of:

# Render the shadow map.

# Render all opaque objects front-to-back.

## Opaque objects write to the depth buffer.

# Render all transparent objects back-to-front.

## Transparent objects do not write to the depth buffer.

Rendering opaque objects front to back minimizes overdraw, where a pixel gets drawn to multiple times in a single frame.

Rendering transparent objects back to front is needed for proper blending of transparent materials.
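
The ordering of the opaque and transparent passes can be sketched as a sort over drawables. The <code>Drawable</code> class and its fields are hypothetical, and the sketch assumes each object has a single view-space depth, which real renderers only approximate (for example with the object's center).

```python
from dataclasses import dataclass


@dataclass
class Drawable:
    name: str
    view_depth: float  # distance from the camera; larger is farther away
    transparent: bool


def draw_order(drawables):
    """Opaque objects front-to-back, then transparent objects back-to-front."""
    opaque = sorted((d for d in drawables if not d.transparent),
                    key=lambda d: d.view_depth)
    transparent = sorted((d for d in drawables if d.transparent),
                         key=lambda d: d.view_depth, reverse=True)
    return opaque + transparent


scene = [
    Drawable("far wall", 20.0, False),
    Drawable("near glass", 2.0, True),
    Drawable("crate", 5.0, False),
    Drawable("far window", 15.0, True),
]
order = [d.name for d in draw_order(scene)]
```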

==Anti-aliasing==

For high-quality anti-aliasing, you'll generally want to use multisample anti-aliasing (MSAA).

This causes the GPU to render the depth buffer at a higher resolution to determine the contribution of your fragment shader's color to the final image.

See https://learnopengl.com/Advanced-OpenGL/Anti-Aliasing#:~:text=How%20MSAA%20really%20works%20is,buffer%20to%20determine%20subsample%20coverage for more details.

==More Terms==

* [[Wikipedia: Diffuse reflection | Diffuse reflection]] - reflection scattered in many directions (i.e. matte)

* [[Wikipedia: Specular reflection | Specular reflection]] - mirror reflection

* [[Wikipedia: Refraction | Refraction]] - change in direction of light as it passes through a material

==Resources==

* [https://www.udacity.com/course/interactive-3d-graphics--cs291 Udacity Interactive 3D Graphics]
