Linear Transformations
-
A linear transformation (or linear map) is a function that takes a vector and produces another vector, while preserving addition and scaling. If $T$ is linear, then:
- $T(\mathbf{u} + \mathbf{v}) = T(\mathbf{u}) + T(\mathbf{v})$
- $T(c\mathbf{u}) = cT(\mathbf{u})$
-
Every linear transformation can be represented as multiplication by a matrix. The matrix is the transformation. When you multiply a vector by a matrix, you are applying a linear transformation to it.
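
The two linearity properties can be checked numerically for a map defined by matrix multiplication. The sketch below is only an illustration: the matrix `A`, the vectors `u` and `v`, and the scalar `c` are arbitrary values assumed for the example, not taken from the text.

```python
import jax.numpy as jnp

# Hypothetical matrix, vectors and scalar, chosen only for illustration.
A = jnp.array([[2.0, 1.0],
               [1.0, 2.0]])
u = jnp.array([1.0, 3.0])
v = jnp.array([-2.0, 0.5])
c = 4.0

def T(x):
    """The map defined by multiplying with A."""
    return A @ x

# Additivity: T(u + v) should equal T(u) + T(v).
additive = jnp.allclose(T(u + v), T(u) + T(v))
# Homogeneity: T(c u) should equal c T(u).
homogeneous = jnp.allclose(T(c * u), c * T(u))

print(additive, homogeneous)
```

Swapping in any other matrix or vectors should leave both checks true, since matrix multiplication always satisfies the two properties.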
-
Think of a $2 \times 2$ matrix as a machine that takes in 2D vectors and outputs new 2D vectors. The columns of the matrix tell you where the standard basis vectors $\hat{\mathbf{i}}$ and $\hat{\mathbf{j}}$ end up after the transformation. Everything else follows from linearity.
- For example, if
$$A = \begin{bmatrix} 2 & 1 \\ 1 & 2 \end{bmatrix}$$
then $\hat{\mathbf{i}} = [1, 0]^T$ lands at $[2, 1]^T$ (column 1) and $\hat{\mathbf{j}} = [0, 1]^T$ lands at $[1, 2]^T$ (column 2). Every other vector is a combination of these two, so its output follows automatically.
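
As a small sketch of the "columns are where the basis vectors land" idea, the code below uses the matrix whose columns are $[2, 1]^T$ and $[1, 2]^T$. The test vector `x` is an assumed value for illustration.

```python
import jax.numpy as jnp

# Matrix whose columns are [2, 1]^T and [1, 2]^T.
A = jnp.array([[2.0, 1.0],
               [1.0, 2.0]])
i_hat = jnp.array([1.0, 0.0])
j_hat = jnp.array([0.0, 1.0])

# Multiplying by a basis vector picks out the matching column of A.
print(A @ i_hat)  # first column of A
print(A @ j_hat)  # second column of A

# x = 3 * i_hat + (-1) * j_hat (values assumed for illustration),
# so its image is the same combination of the columns of A.
x = jnp.array([3.0, -1.0])
from_columns = 3.0 * A[:, 0] + (-1.0) * A[:, 1]
print(A @ x, from_columns)
```

Both printed vectors on the last line agree, which is linearity at work: knowing the two columns is enough to transform any vector.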
-
Multiplying two matrices can be thought of as applying one transformation after another. If $B$ transforms vectors from one space and $A$ transforms the result, then $AB$ does both in sequence. In a game engine, rotating a character and then moving them forward is a different result from moving them first and then rotating, which is why matrix multiplication is not commutative.
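
A minimal sketch of why order matters. Translation is not a matrix product in plain 2D coordinates, so this example substitutes a stretch along x for the "move" step; the matrices `A` and `B` and the vector `v` are assumptions chosen for illustration.

```python
import jax.numpy as jnp

B = jnp.array([[0.0, -1.0],
               [1.0,  0.0]])   # rotate 90 degrees counter-clockwise
A = jnp.array([[2.0, 0.0],
               [0.0, 1.0]])    # stretch the x-component by 2
v = jnp.array([1.0, 0.0])

# A @ B applies B first, then A.
rotate_then_stretch = (A @ B) @ v
# B @ A applies A first, then B.
stretch_then_rotate = (B @ A) @ v

print(rotate_then_stretch)  # [0. 1.]
print(stretch_then_rotate)  # [0. 2.]
```

Rotating first moves the vector onto the y-axis, where the x-stretch has no effect; stretching first doubles it before the rotation. The two products `A @ B` and `B @ A` are therefore different matrices.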
-
Rotation turns vectors by an angle $\theta$ without changing their length. The vector stays the same size, it just points in a new direction.
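- In 2D, the rotation matrix is:
$$R(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}$$
- For $\theta = 90°$:
$$R(90°) = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}$$
so $[1, 0]^T$ becomes $[0, 1]^T$. The vector pointing right now points up. Rotation matrices are orthogonal and always have determinant 1. When you rotate a photo on your phone, this is essentially the matrix being applied to every pixel coordinate.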
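
The claims about rotation (length is preserved, determinant is 1) can be checked with a short sketch. The helper name `rotation` and the test vector are assumptions for illustration; results are compared with a tolerance because `jax.numpy` uses 32-bit floats by default.

```python
import jax.numpy as jnp

def rotation(theta):
    """2D counter-clockwise rotation matrix for an angle in radians."""
    c, s = jnp.cos(theta), jnp.sin(theta)
    return jnp.array([[c, -s],
                      [s,  c]])

R = rotation(jnp.pi / 2)       # 90 degrees
v = jnp.array([3.0, 4.0])      # length 5
v_rot = R @ v                  # close to [-4, 3]

print(v_rot)
print(jnp.linalg.norm(v), jnp.linalg.norm(v_rot))  # both close to 5
print(jnp.linalg.det(R))                           # close to 1
```

Trying other angles in `rotation(...)` changes the direction of `v_rot` but not its length or the determinant.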
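- In 3D, there are separate rotation matrices for each axis. A robotic arm rotates each joint around a specific axis, and each joint is one rotation matrix. Rotation around the z-axis looks like the 2D case embedded in 3D:
$$R_z(\theta) = \begin{bmatrix} \cos\theta & -\sin\theta & 0 \\ \sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{bmatrix}$$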
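
A sketch of rotation about the z-axis, assuming the standard form with the 2D rotation in the top-left block. The helper name `rotation_z` and the point `p` are hypothetical.

```python
import jax.numpy as jnp

def rotation_z(theta):
    """3D rotation about the z-axis: the 2D rotation acts on (x, y), z is untouched."""
    c, s = jnp.cos(theta), jnp.sin(theta)
    return jnp.array([[c,  -s,  0.0],
                      [s,   c,  0.0],
                      [0.0, 0.0, 1.0]])

p = jnp.array([1.0, 0.0, 5.0])
p_rot = rotation_z(jnp.pi / 2) @ p   # close to [0, 1, 5]

print(p_rot)
```

The x and y components rotate exactly as in the 2D case, while the z component passes through unchanged.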
- Scaling stretches or shrinks vectors along each axis independently:
$$S(s_x, s_y) = \begin{bmatrix} s_x & 0 \\ 0 & s_y \end{bmatrix}$$
-
$S(2, 1.5)$ doubles the x-component and multiplies the y-component by 1.5. Scaling by $-1$ along an axis flips that component. A diagonal matrix is always a scaling transformation. When you resize an image to 50%, you are applying $S(0.5, 0.5)$ to every pixel coordinate.
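
A small sketch of $S(2, 1.5)$ and $S(0.5, 0.5)$ in code. The helper name `scaling`, the test vector, and the 640 by 480 corner coordinate are assumptions for illustration.

```python
import jax.numpy as jnp

def scaling(sx, sy):
    """Diagonal matrix that scales x by sx and y by sy."""
    return jnp.diag(jnp.array([sx, sy]))

v = jnp.array([3.0, 2.0])
print(scaling(2.0, 1.5) @ v)        # [6. 3.]

# Resizing to 50%: the far corner of a hypothetical 640x480 image.
corner = jnp.array([640.0, 480.0])
print(scaling(0.5, 0.5) @ corner)   # [320. 240.]

# The determinant is the product of the scale factors (the area factor).
print(jnp.linalg.det(scaling(2.0, 1.5)))
```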
-
Reflection flips vectors across an axis or line, like a mirror. Reflecting across the x-axis keeps the x-component and negates the y-component:
$$\begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$
- For example, $[3, 2]^T$ becomes $[3, -2]^T$. When your phone flips a selfie horizontally so text reads correctly, it is applying a reflection matrix. Reflecting across the line $y = x$ swaps the two components:
$$\begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix}$$
-
Reflection matrices have determinant $-1$, confirming they flip orientation.
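
The two reflections described above can be sketched as follows. The variable names are hypothetical; the matrices are the reflection across the x-axis and across the line $y = x$.

```python
import jax.numpy as jnp

reflect_x_axis = jnp.array([[1.0,  0.0],
                            [0.0, -1.0]])   # keep x, negate y
reflect_diagonal = jnp.array([[0.0, 1.0],
                              [1.0, 0.0]])  # swap x and y

v = jnp.array([3.0, 2.0])
print(reflect_x_axis @ v)     # [ 3. -2.]
print(reflect_diagonal @ v)   # [2. 3.]

# Both determinants are -1: orientation is flipped.
print(jnp.linalg.det(reflect_x_axis), jnp.linalg.det(reflect_diagonal))

# Reflecting twice returns every vector to where it started.
twice = reflect_x_axis @ reflect_x_axis
print(twice)
```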
-
Rotations and reflections are both rigid transformations: they preserve distances and angles. The matrices that represent them are orthogonal matrices, meaning $Q^T Q = I$. Taking determinants of both sides gives $\det(Q)^2 = 1$, which is why orthogonal matrices always have determinant $+1$ (rotation) or $-1$ (reflection).
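
One way to test whether a matrix is orthogonal is to check that its transpose times itself gives the identity. The sketch below applies that check to four example matrices; the helper `is_orthogonal` and the specific values are assumptions for illustration.

```python
import jax.numpy as jnp

def is_orthogonal(M, atol=1e-6):
    """True when M^T M equals the identity, within a tolerance."""
    return bool(jnp.allclose(M.T @ M, jnp.eye(M.shape[0]), atol=atol))

theta = jnp.pi / 6
rotation = jnp.array([[jnp.cos(theta), -jnp.sin(theta)],
                      [jnp.sin(theta),  jnp.cos(theta)]])
reflection = jnp.array([[0.0, 1.0],
                        [1.0, 0.0]])
scaling = jnp.diag(jnp.array([2.0, 1.5]))
shear = jnp.array([[1.0, 0.5],
                   [0.0, 1.0]])

for name, M in [("rotation", rotation), ("reflection", reflection),
                ("scaling", scaling), ("shear", shear)]:
    print(name, is_orthogonal(M), float(jnp.linalg.det(M)))
```

The shear has determinant 1 but is not orthogonal, so a determinant of $\pm 1$ is necessary for a rigid transformation but not sufficient.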
-
Shearing skews vectors along one axis proportionally to the other. A horizontal shear by factor $k$:
$$\begin{bmatrix} 1 & k \\ 0 & 1 \end{bmatrix}$$
-
Each point slides horizontally by $k$ times its height. With $k = 0.5$, a point at height 2 shifts right by 1. The bottom row stays put, the top row slides. This is how italic text works: upright letters are sheared so they slant to the right.
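
The worked numbers above ($k = 0.5$, a point at height 2) can be reproduced in a few lines. The specific points are assumed for illustration.

```python
import jax.numpy as jnp

k = 0.5
shear = jnp.array([[1.0, k],
                   [0.0, 1.0]])

top_point = jnp.array([0.0, 2.0])      # height 2
bottom_point = jnp.array([3.0, 0.0])   # height 0

print(shear @ top_point)      # [1. 2.]  shifted right by k * 2 = 1
print(shear @ bottom_point)   # [3. 0.]  unchanged, since its height is 0
```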
-
All of the above (rotation, scaling, reflection, shearing) are linear transformations. They keep the origin fixed and preserve straight lines. But what about translation (shifting everything by a fixed amount)?
-
Translation is not a linear transformation because it moves the origin. If you shift every point right by 3, the zero vector moves to $[3, 0]^T$, breaking linearity. To handle it, we use an affine transformation, which combines a linear transformation with a translation:
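
A quick numerical sketch of why translation fails the linearity test. The shift `t` matches the "right by 3" example; the vectors `u` and `v` are assumed values.

```python
import jax.numpy as jnp

t = jnp.array([3.0, 0.0])

def translate(x):
    """Shift every point right by 3."""
    return x + t

# A linear map must send the zero vector to itself; translation does not.
print(translate(jnp.zeros(2)))   # [3. 0.]

# Additivity also fails: the shift is applied once on the left, twice on the right.
u = jnp.array([1.0, 2.0])
v = jnp.array([4.0, -1.0])
lhs = translate(u + v)               # [8. 1.]
rhs = translate(u) + translate(v)    # [11. 1.]
print(lhs, rhs)
```

The two sides differ by exactly `t`, the extra copy of the shift.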
$$\mathbf{y} = A\mathbf{x} + \mathbf{t}$$
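- To represent this as a single matrix multiplication, we use homogeneous coordinates: add an extra 1 to every vector and use an $(n+1) \times (n+1)$ matrix:
$$\begin{bmatrix} \mathbf{y} \\ 1 \end{bmatrix} = \begin{bmatrix} A & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix} \begin{bmatrix} \mathbf{x} \\ 1 \end{bmatrix}$$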
-
Affine transformations preserve straight lines and parallelism, but not necessarily angles or lengths. Every object in a video game is positioned using affine transformations: rotate it, scale it, then place it at the right location, all encoded in a single matrix.
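
A minimal sketch of homogeneous coordinates, assuming the usual block layout with $A$ in the top-left, $\mathbf{t}$ in the last column, and a bottom row of zeros followed by 1. The helper `affine_matrix` and the example values are hypothetical.

```python
import jax.numpy as jnp

def affine_matrix(A, t):
    """Build the (n+1) x (n+1) homogeneous matrix [[A, t], [0, 1]]."""
    n = A.shape[0]
    top = jnp.hstack([A, t.reshape(n, 1)])
    bottom = jnp.hstack([jnp.zeros((1, n)), jnp.ones((1, 1))])
    return jnp.vstack([top, bottom])

A = jnp.array([[0.0, -1.0],
               [1.0,  0.0]])   # rotate 90 degrees
t = jnp.array([3.0, 0.0])      # then shift right by 3
M = affine_matrix(A, t)

x = jnp.array([1.0, 2.0])
x_h = jnp.append(x, 1.0)       # homogeneous version of x: [1, 2, 1]

y_h = M @ x_h                  # single matrix multiplication
y = A @ x + t                  # the affine formula, for comparison

print(y_h)   # [1. 1. 1.]
print(y)     # [1. 1.]
```

Because the whole transformation is now one matrix, several affine steps can be combined by multiplying their homogeneous matrices together.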
-
A degenerate transformation (singular matrix) collapses space into a lower dimension.
-
For example, a matrix such as
$$\begin{bmatrix} 1 & 2 \\ 2 & 4 \end{bmatrix}$$
maps every 2D vector onto a single line, because both columns point in the same direction. The determinant is zero, information is lost, and the transformation cannot be undone.
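
The loss of information can be made concrete with a sketch. The matrix `D` below is an assumed example whose second column is twice its first; the input vectors are also chosen for illustration.

```python
import jax.numpy as jnp

# Hypothetical singular matrix: column 2 is 2 * column 1.
D = jnp.array([[1.0, 2.0],
               [2.0, 4.0]])

print(jnp.linalg.det(D))           # 0: areas collapse to nothing
print(jnp.linalg.matrix_rank(D))   # 1: outputs fill only a line

# Two different inputs land on the same output, so the map cannot be undone.
a = jnp.array([2.0, 0.0])
b = jnp.array([0.0, 1.0])
print(D @ a, D @ b)                # both [2. 4.]
```

Given only the output $[2, 4]^T$, there is no way to tell whether the input was `a`, `b`, or any other vector that maps to the same point.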
-
Converting a colour image (3 values per pixel: red, green, blue) to grayscale (1 value per pixel) is a degenerate transformation: the colour information is permanently gone.
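
A sketch of grayscale conversion as a $1 \times 3$ matrix acting on an RGB vector. The weights below are an assumption (a commonly used set of luma weights); the text does not specify which weights are meant, and other choices work the same way.

```python
import jax.numpy as jnp

# Assumed weights for red, green, blue; they sum to 1.
to_gray = jnp.array([[0.299, 0.587, 0.114]])   # shape (1, 3)

red = jnp.array([1.0, 0.0, 0.0])
dark_gray = jnp.array([0.299, 0.299, 0.299])

# Two very different colours produce the same grayscale value.
print(to_gray @ red)         # close to [0.299]
print(to_gray @ dark_gray)   # close to [0.299]
```

Since distinct colours collapse to one gray value, the original colour cannot be recovered from the grayscale pixel alone.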
-
In ML, linear transformations are the core of neural networks. Data is represented as a matrix: a stack of vectors, each holding the features of an object (a person, a plane, a piece of text, an image, anything).
-
Each layer applies a matrix multiplication (a linear transformation). The details are left to other chapters, which explain how to structure this data and motivate neural networks properly.
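
As a rough sketch of "data as a matrix, layer as a matrix multiplication": the data matrix `X` stacks one feature vector per row, and a single weight matrix `W` transforms all rows at once. The shapes and values are assumptions for illustration, and real layers typically add more on top of this product.

```python
import jax.numpy as jnp

# 4 objects, each described by 3 features (hypothetical values).
X = jnp.arange(12.0).reshape(4, 3)

# Hypothetical weight matrix mapping 3 features to 2 outputs.
W = jnp.array([[1.0,  0.0],
               [0.0,  1.0],
               [1.0, -1.0]])

Y = X @ W                  # shape (4, 2): every row transformed by the same W
first_row = W.T @ X[0]     # the same linear map applied to one vector

print(Y.shape)
print(Y[0], first_row)     # both [ 2. -1.]
```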
-
However, the most widely used models today pass data through long stacks of linear transformations, interleaved with a few nonlinear operations. We call these models Transformers.
-
Gemini, ChatGPT, Claude, Qwen, DeepSeek, and many of the best-performing AI systems in the world today are Transformers!
Coding Tasks (use Colab or a notebook)
- Apply a rotation matrix to a vector and plot both the original and rotated vector. Try different angles.
```python
import jax.numpy as jnp
import matplotlib.pyplot as plt

# Rotation angle in radians (60 degrees here); try other values
theta = jnp.pi / 3
R = jnp.array([[jnp.cos(theta), -jnp.sin(theta)],
               [jnp.sin(theta),  jnp.cos(theta)]])

v = jnp.array([1.0, 0.0])
v_rot = R @ v  # apply the rotation

# Draw both vectors as arrows starting at the origin
plt.figure(figsize=(5, 5))
plt.quiver(0, 0, v[0], v[1], angles='xy', scale_units='xy', scale=1, color='red', label='original')
plt.quiver(0, 0, v_rot[0], v_rot[1], angles='xy', scale_units='xy', scale=1, color='blue', label='rotated')
plt.xlim(-1.5, 1.5); plt.ylim(-1.5, 1.5)
plt.grid(True); plt.legend(); plt.gca().set_aspect('equal')
plt.show()
```
- Apply a shearing transformation to a set of points forming a square and visualise the deformed shape.
```python
import jax.numpy as jnp
import matplotlib.pyplot as plt

# Corners of the unit square as columns; the first corner is repeated to close the outline
square = jnp.array([[0, 0], [1, 0], [1, 1], [0, 1], [0, 0]]).T

# Horizontal shear: each point slides right by k times its height
k = 0.5
shear = jnp.array([[1, k],
                   [0, 1]])
sheared = shear @ square

plt.figure(figsize=(6, 4))
plt.plot(square[0], square[1], 'r-o', label='original')
plt.plot(sheared[0], sheared[1], 'b-o', label='sheared')
plt.grid(True); plt.legend(); plt.gca().set_aspect('equal')
plt.show()
```