Canonical view transform

The next transformation, $ T_{can}$, performs the perspective projection as described in Section 3.4; however, we must explain how it is unnaturally forced into a 4 by 4 matrix. We also want the result to be in a canonical form that appears to be unitless, which is again motivated by industrial needs. Therefore, $ T_{can}$ is called the canonical view transform. Figure 3.18 shows a viewing frustum, which is based on the four corners of a rectangular virtual screen. At $ z = n$ and $ z = f$ lie a near plane and far plane, respectively. Note that $ n < 0$ and $ f < 0$ because the $ z$ axis points opposite to the viewing direction. The virtual screen is contained in the near plane. The perspective projection should place all of the points inside of the frustum onto a virtual screen that is centered in the near plane. This implies $ d = n$ using (3.40).

We now want to reproduce (3.40) using a matrix. Consider the result of applying the following matrix multiplication:

$\displaystyle \begin{bmatrix}n & 0 & 0 & 0 \\ 0 & n & 0 & 0 \\ 0 & 0 & n & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix}x \\ y \\ z \\ 1 \end{bmatrix} = \begin{bmatrix}nx \\ ny \\ nz \\ z \end{bmatrix} .$ (3.42)

In the first two coordinates, we obtain the numerator of (3.40). The nonlinear part of (3.40) is the $ 1/z$ factor. To handle this, the fourth coordinate is used to represent $ z$, rather than $ 1$ as in the case of $ T_{rb}$. From this point onward, the resulting 4D vector is interpreted as a 3D vector that is scaled by dividing out its fourth component. For example, $ (v_1,v_2,v_3,v_4)$ is interpreted as

$\displaystyle (v_1/v_4, v_2/v_4, v_3/v_4) .$ (3.43)

Thus, the result from (3.42) is interpreted as

$\displaystyle (nx/z, ny/z, n) ,$ (3.44)

in which the first two coordinates match (3.40) with $ d = n$, and the third coordinate is the location of the virtual screen along the $ z$ axis.
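
The steps in (3.42) through (3.44) can be checked numerically. The following is a minimal sketch in plain Python, not code from the text: the function names `canonical_projection_matrix`, `mat_vec`, `homogeneous_divide`, and `project` are hypothetical, and the sample values $ n = -1$ and the point $ (2, 4, -4)$ are chosen only for illustration.

```python
def canonical_projection_matrix(n):
    """Return the 4 by 4 matrix of (3.42) for a near plane at z = n (n < 0)."""
    return [
        [n, 0, 0, 0],
        [0, n, 0, 0],
        [0, 0, n, 0],
        [0, 0, 1, 0],  # copies z into the fourth coordinate
    ]


def mat_vec(m, v):
    """Multiply a matrix (list of rows) by a column vector (list)."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def homogeneous_divide(v):
    """Interpret (v1, v2, v3, v4) as the 3D point of (3.43)."""
    v1, v2, v3, v4 = v
    if v4 == 0:
        raise ValueError("fourth coordinate is zero; the point has z = 0")
    return (v1 / v4, v2 / v4, v3 / v4)


def project(point, n):
    """Apply (3.42) to (x, y, z, 1), then divide out the fourth coordinate."""
    x, y, z = point
    return homogeneous_divide(mat_vec(canonical_projection_matrix(n), [x, y, z, 1]))


# Illustrative values (assumed): near plane at n = -1, point in front of the eye.
print(project((2.0, 4.0, -4.0), -1.0))  # (0.5, 1.0, -1.0)
```

The first two outputs are $ nx/z$ and $ ny/z$, and the third equals $ n$ for every input point, in agreement with (3.44).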

Steven M LaValle 2020-11-11