Normal equation
For \(A \in \mathbb{R}^{m \times n}\), \(b \in \mathbb{R}^m\) and \(x \in \mathbb{R}^n\), the normal equation is given by
\[ A^\top A x = A^\top b \]
Existence
There always exists a solution to the normal equation.
Proof
We want to prove that \(\text{Im}(A^\top A) = \text{Im}(A^\top)\).
First, it is clear that \(\text{Im}(A^\top A) \subset \text{Im}(A^\top)\). Then, we need to prove the equality of dimensions, which, with the property of finite-dimensional subspaces, allows us to conclude the equality of the two spaces.
By noting that \(\text{rank}(A) = \text{rank}(A^\top)\) (which can be justified by the \(J_r\) decomposition), it remains to show that \(\text{rank}(A) = \text{rank}(A^\top A)\).
First, we prove that \(\text{Ker}(A^\top A) = \text{Ker}(A)\).
It is clear that \(\text{Ker}(A) \subset \text{Ker}(A^\top A)\). Then, for \(x \in \text{Ker}(A^\top A)\), we have:
\[
\begin{align*}
A^\top A x = 0 &\Rightarrow x^\top A^\top A x = 0\\
&\Rightarrow \|Ax\|^2_2 = 0\\
&\Rightarrow Ax = 0
\end{align*}
\]
Thanks to the rank theorem, we have:
\[
n = \text{dim}(\text{Ker}(A)) + \text{rank}(A)
\]
and
\[
n = \text{dim}(\text{Ker}(A^\top A)) + \text{rank}(A^\top A)
\]
So, because \(\text{Ker}(A^\top A) = \text{Ker}(A)\), we also have the equality of dimensions and thus:
\[
\text{rank}(A) = \text{rank}(A^\top A)
\]
Finally, we have:
\[
A^\top b \in \text{Im}(A^\top) = \text{Im}(A^\top A)
\]
So, there exists \(x\) such that \(A^\top A x = A^\top b\).
Remarks
- \(x\) is a solution to the normal equation if and only if it minimizes the least square error \( \| Ax - b \|_2^2 \).
- For a full rank matrix \(A\), there exists a unique solution: \[ x = (A^\top A)^{-1} A^\top b \]
- \( x = (A^\top A)^+ A^\top b \) belongs to the set of solutions and minimizes the norm of \(x\).
Acknowledgment: Some insights and inspiration were taken from discussions with Maël Chaumette.