## Eigenvalues vs. Singular Values

by Suraj Rampure (suraj.rampure@berkeley.edu)

In discussion for Data 100, I ignored the difference between eigenvalues and singular values. They are actually very different, and we'll look at the formal definitions of both and the relationships between them here.

## Eigenvalues

Suppose $A \in \mathbb{M}^{n \times n}$ is an $n \times n$ square matrix, and $\textbf{v} \in \mathbb{R}^n$ is a nonzero $n$-element column vector. Then, if

$$A\textbf{v} = \lambda\textbf{v}$$

for some scalar $\lambda$, we say $\textbf{v}$ is an eigenvector of $A$ with corresponding eigenvalue $\lambda$.
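As a quick numerical check of this definition, here is a minimal sketch using NumPy. The $2 \times 2$ matrix is a made-up example (it is not from discussion), chosen so that its eigenvalues work out to 2 and 5.

```python
import numpy as np

# Hypothetical example matrix, chosen only for illustration.
A = np.array([[4.0, 1.0],
              [2.0, 3.0]])

# np.linalg.eig returns the eigenvalues and a matrix whose COLUMNS are eigenvectors.
eigenvalues, eigenvectors = np.linalg.eig(A)

# Check the defining equation A v = lambda v for each eigenpair.
for lam, v in zip(eigenvalues, eigenvectors.T):
    print(lam, np.allclose(A @ v, lam * v))
```

Note that NumPy does not promise any particular ordering of the eigenvalues, so sort them yourself if the order matters.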

Note, eigenvectors and eigenvalues exist only when $A$ is a square matrix. Let's take a look at why.

Suppose $A \in \mathbb{M}^{m \times n}$, meaning $A$ is a matrix with $m$ rows and $n$ columns. If we want to multiply $A$ on the right by a column vector $\textbf{v}$, that vector must have $n$ elements in order for the dimensions of $A$ and $\textbf{v}$ to allow multiplication. However, when we multiply an $m \times n$ matrix by an $n \times 1$ vector, the result will be an $m \times 1$ vector.

For example, suppose $A = \begin{bmatrix} 2 & 3 & 4 & 5 & 6 \\ 10 & 2 & -3 & 1 & 0\end{bmatrix}$ and $\textbf{v} = \begin{bmatrix} 1 \\ 2 \\ 0 \\ -1 \\ 3 \end{bmatrix} \in \mathbb{R}^5$. The result of this matrix-vector multiplication will be $\begin{bmatrix} 21 \\ 13 \end{bmatrix} \in \mathbb{R}^2$.

The only way that $A\textbf{v}$ can have the same dimensions as $\textbf{v}$ is if $m = n$, i.e. $A$ is square.
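The dimension argument is easy to confirm in NumPy. This sketch uses the same $2 \times 5$ matrix and 5-element vector as the example above; only the variable name `result` is new.

```python
import numpy as np

# The 2 x 5 matrix and 5-element vector from the example above.
A = np.array([[2, 3, 4, 5, 6],
              [10, 2, -3, 1, 0]])
v = np.array([1, 2, 0, -1, 3])

result = A @ v

print(A.shape, v.shape, result.shape)  # (2, 5) (5,) (2,)
print(result)                          # [21 13]
```

The product has 2 entries while $\textbf{v}$ has 5, so $A\textbf{v}$ can never be a scalar multiple of $\textbf{v}$ here.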

## Singular Values

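To find the singular values of $A$, we begin by finding the eigenvalues of $A^TA$. If $A \in \mathbb{M}^{m \times n}$, then $A^TA$ will be an $n \times n$ symmetric matrix. Since $A^TA$ is square, it has eigenvalues, and because it is symmetric, those eigenvalues are real. Furthermore, all of $A^TA$'s eigenvalues will be non-negative: if $A^TA\textbf{v} = \lambda\textbf{v}$ for a nonzero $\textbf{v}$, then $\lambda \|\textbf{v}\|^2 = \textbf{v}^TA^TA\textbf{v} = \|A\textbf{v}\|^2 \geq 0$.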

Suppose $\lambda_1, \lambda_2, ..., \lambda_n$ are the $n$ eigenvalues of $A^TA$. The singular values of $A$, then, are $\sigma_1 = \sqrt{\lambda_1}, \sigma_2 = \sqrt{\lambda_2}, ..., \sigma_n = \sqrt{\lambda_n}$. In other words, the singular values of $A$ are the square roots of the eigenvalues of $A^TA$. Notice, since $\lambda_i$ is non-negative, we can always take the square root.
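To see this recipe in action, here is a rough sketch that computes the singular values of the $2 \times 5$ example matrix from the Eigenvalues section in two ways: from the eigenvalues of $A^TA$, and with NumPy's built-in SVD. The variable names are my own. One thing to keep in mind: `np.linalg.svd` reports only $\min(m, n)$ singular values, whereas $A^TA$ has $n$ eigenvalues; the extra ones are (numerically) zero.

```python
import numpy as np

# The 2 x 5 example matrix from the Eigenvalues section.
A = np.array([[2.0, 3.0, 4.0, 5.0, 6.0],
              [10.0, 2.0, -3.0, 1.0, 0.0]])
m, n = A.shape

# A^T A is symmetric, so eigvalsh applies. It returns eigenvalues in
# ascending order; reverse them so the largest comes first.
gram_eigenvalues = np.linalg.eigvalsh(A.T @ A)[::-1]

# Round-off can turn a true zero eigenvalue into a tiny negative number,
# so clip at zero before taking square roots.
sigma_from_eig = np.sqrt(np.clip(gram_eigenvalues, 0.0, None))

# NumPy's SVD returns min(m, n) singular values in descending order.
sigma_from_svd = np.linalg.svd(A, compute_uv=False)

k = min(m, n)
print("from eigenvalues of A^T A:", sigma_from_eig[:k])
print("from np.linalg.svd:       ", sigma_from_svd)
print("leftover eigenvalues ~ 0: ", np.allclose(gram_eigenvalues[k:], 0.0, atol=1e-9))
```

In practice you would call the SVD routine directly rather than forming $A^TA$, since squaring the matrix loses numerical precision; the eigenvalue route is shown here only because it mirrors the definition.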

## When are they the same?

There is a very special case in which the singular values of a matrix are the same as its eigenvalues, up to sign.

Claim: If $A$ is a symmetric matrix, i.e. $A = A^T$, then the singular values of $A$ are equal to the absolute values of the eigenvalues of $A$. In other words, if $A$ is symmetric and $\lambda_1, \lambda_2, ..., \lambda_n$ are the eigenvalues of $A$, then $\sigma_1 = |\lambda_1|, \sigma_2 = |\lambda_2|, ..., \sigma_n = |\lambda_n|$.

Proof: First, we'll show that if $\lambda$ is an eigenvalue of a symmetric matrix $A$, then $\lambda^2$ is an eigenvalue of $A^TA$ (which equals $A^2$, since $A = A^T$).

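$$\begin{aligned} A\textbf{v} &= \lambda\textbf{v} \\ A^TA\textbf{v} &= A^T\lambda\textbf{v} \\ &= \lambda A^T\textbf{v} \\ &= \lambda (A\textbf{v}) \\ &= \lambda^2\textbf{v} \end{aligned}$$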

Now, we've shown that the eigenvalues of $A^TA$ are of the form $\lambda^2$. The singular values of $A$ are simply the square roots of the eigenvalues of $A^TA$, i.e. $\sqrt{\lambda^2}$. $\lambda$ could have originally been negative, so we must say $\sqrt{\lambda^2} = |\lambda|$.

This proves that if $\lambda$ is an eigenvalue of a symmetric matrix $A$, then $|\lambda|$ is a singular value of $A$.

Furthermore, if $A$ is positive semi-definite, meaning it is symmetric AND all of its eigenvalues are non-negative, we can remove the absolute value symbol, and simply state that the eigenvalues and singular values of $A$ coincide.
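Here is a small numerical illustration of the claim, as a sketch rather than a proof. The symmetric matrix `S` is a made-up example with one positive and one negative eigenvalue, and `P = S.T @ S` is used as a convenient positive semi-definite matrix; neither comes from discussion.

```python
import numpy as np

# Hypothetical symmetric matrix with one negative eigenvalue.
S = np.array([[2.0, 1.0],
              [1.0, -3.0]])

eigenvalues = np.linalg.eigvalsh(S)                   # ascending order
singular_values = np.linalg.svd(S, compute_uv=False)  # descending order

# Sort |lambda| in descending order so it lines up with the singular values.
abs_eigenvalues = np.sort(np.abs(eigenvalues))[::-1]

print("eigenvalues:    ", eigenvalues)
print("singular values:", singular_values)
print("sigma == |lambda|:", np.allclose(abs_eigenvalues, singular_values))

# For a positive semi-definite matrix, no absolute value is needed.
P = S.T @ S
psd_eigenvalues = np.linalg.eigvalsh(P)[::-1]
psd_singular_values = np.linalg.svd(P, compute_uv=False)
print("PSD case, sigma == lambda:", np.allclose(psd_eigenvalues, psd_singular_values))
```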

## Summary

In short:

• Eigenvalues are only defined for square matrices, whereas singular values are defined for all matrices
• Even when a matrix is square, its singular values are, in general, not equal to its eigenvalues (or to their absolute values)
• If a matrix is symmetric (also meaning it is square), then its eigenvalues and singular values are closely related by $\sigma = |\lambda|$
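To make the second bullet concrete, here is a sketch with a square but non-symmetric matrix. The matrix `B` is a made-up example: both of its eigenvalues are 1, yet its singular values work out to $(1 + \sqrt{5})/2$ and $(\sqrt{5} - 1)/2$.

```python
import numpy as np

# Hypothetical square, non-symmetric matrix.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])

eigenvalues = np.linalg.eigvals(B)
singular_values = np.linalg.svd(B, compute_uv=False)

print("eigenvalues:    ", eigenvalues)
print("singular values:", singular_values)

# Unlike the symmetric case, |lambda| and sigma do not match here.
abs_eigenvalues = np.sort(np.abs(eigenvalues))[::-1]
print("sigma == |lambda|:", np.allclose(abs_eigenvalues, singular_values))
```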

For the purposes of our course, this is relevant when looking at PCA. We find the directions in which our data vary the most by determining and ranking the singular values of our data matrix $A$ (the "directions" that we choose are actually the eigenvectors of $A^TA$). All of this, of course, is done after the columns of $A$ are de-meaned.
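The PCA recipe described above can be sketched in a few lines of NumPy. Everything about the data here is an assumption for illustration: `X` is randomly generated, with 100 rows and 3 columns of deliberately different spreads. The point is only to show that the rows of `Vt` returned by the SVD are eigenvectors of $A^TA$, with eigenvalues $\sigma^2$.

```python
import numpy as np

# Hypothetical data: 100 observations, 3 features with different spreads.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * np.array([5.0, 2.0, 0.5])

# De-mean each column before doing anything else.
A = X - X.mean(axis=0)

# Rows of Vt are the principal directions; sigma is sorted in descending order.
U, sigma, Vt = np.linalg.svd(A, full_matrices=False)

# Each direction v should satisfy (A^T A) v = sigma^2 v.
gram = A.T @ A
for s, v in zip(sigma, Vt):
    print(s, np.allclose(gram @ v, (s ** 2) * v))
```

The first row of `Vt` is the direction along which the de-meaned data vary the most, the second row the next most, and so on.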

Look at this note if you're interested in reading more about singular values and the singular value decomposition (SVD).