Singular Value Decomposition

The Singular Value Decomposition, or SVD for short, is a matrix decomposition method for reducing a matrix to its constituent parts in order to make certain subsequent matrix calculations simpler. It decomposes a matrix into the product of three other matrices:

$A=USV^{T}$

Where:

  • A is an m$\times$n matrix

  • U is an m$\times$n matrix with orthonormal columns

  • S is an n$\times$n diagonal matrix whose diagonal entries are the singular values of A

  • V is an n$\times$n orthogonal matrix
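As a quick illustration, the factorization can be computed with NumPy's `np.linalg.svd` (a minimal sketch; the matrix used here is the same A as in the worked example further below):

```python
import numpy as np

# Example matrix (m = 4 rows, n = 2 columns)
A = np.array([[2, 4],
              [1, 3],
              [0, 1],
              [-1, 0.5]])

# full_matrices=False gives the compact form: U is m x n, s has n values, Vt is n x n
U, s, Vt = np.linalg.svd(A, full_matrices=False)
S = np.diag(s)                       # singular values as a diagonal matrix

print(U.shape, S.shape, Vt.shape)    # (4, 2) (2, 2) (2, 2)
print(np.allclose(A, U @ S @ Vt))    # True: A = U S V^T
```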

SVD of real symmetric matrix:

  • A is real symmetric of size n$\times$n

  • $A=A^{T}$

  • U=V since $A^{T}A=AA^{T}=A^{2}$

$A=Q\Sigma Q^{T}$

  • Q is of size $n \times n$ and contains the eigenvectors of $A^{2}$

  • This is called the spectral decomposition of A

  • $\Sigma$ contains the n singular values

  • Eigenvectors of A = eigenvectors of $A^{2}$

  • Absolute values of the eigenvalues of A = square roots of the eigenvalues of $A^{2}$

  • Singular values of A = absolute values of the eigenvalues of A; they coincide when A is positive semi-definite (a numerical check follows below)
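A small check of these relationships, assuming a symmetric positive semi-definite matrix (built here as $B^{T}B$) so that eigenvalues and singular values agree without sign issues:

```python
import numpy as np

# Build a symmetric positive semi-definite matrix (B^T B is always symmetric PSD)
B = np.array([[2.0, 4.0],
              [1.0, 3.0]])
A = B.T @ B                                   # A = A^T

lam, Q = np.linalg.eigh(A)                    # spectral decomposition: A = Q diag(lam) Q^T
U, s, Vt = np.linalg.svd(A)                   # SVD of A

# Singular values of A equal its eigenvalues (A is PSD, so no sign issues)
print(np.allclose(np.sort(s), np.sort(lam)))                # True

# Eigenvalues of A are the square roots of the eigenvalues of A^2
lam_sq, _ = np.linalg.eigh(A @ A)
print(np.allclose(np.sort(np.sqrt(lam_sq)), np.sort(lam)))  # True
```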

Transformation using SVD

  • Transformed data

$T=AV=U\Sigma$

  • V is called SVD transform matrix

  • Essentially, T is just a rotation of A

  • Dimensionality of T is n

  • T uses n basis vectors that are different from those of the original space

  • columns of V give the basis vectors in rotated space

  • V shows how each dimension can be represented as a linear combination of other dimensions

  • Columns are input basis vectors

  • U shows how each object can be represented as a linear combination of other objects

  • Columns are output basis vectors

  • Lengths of the row vectors are preserved (verified in the sketch below):

$\|\vec{a}_{i}\|_{2} =\|\vec{t}_{i}\|_{2}$
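The rotation property and the length preservation can be checked with a short NumPy sketch, reusing the example matrix from this answer:

```python
import numpy as np

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)
V = Vt.T                                     # columns of V are the new basis vectors

T = A @ V                                    # transformed data: T = A V
print(np.allclose(T, U * s))                 # True: T also equals U Sigma (U * s scales the columns of U)

# Row lengths are preserved by the rotation: ||a_i|| == ||t_i|| for every row i
print(np.allclose(np.linalg.norm(A, axis=1),
                  np.linalg.norm(T, axis=1)))   # True
```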

Example

$A=\begin{bmatrix} 2 & 4 \\ 1 & 3 \\ 0 & 1 \\ -1 & 0.5 \end{bmatrix} =U\begin{bmatrix} -0.80 & 0.22 & 0.05 & 0.54 \\ -0.56 & -0.20 & -0.34 & -0.71 \\ -0.16 & -0.31 & 0.90 & -0.21 \\ -0.01 & -0.89 & -0.22 & 0.37 \end{bmatrix}\times \Sigma\begin{bmatrix} 5.54 & 0 \\ 0 & 1.24 \\ 0 & 0 \\ 0 & 0 \end{bmatrix} \times V^{ T }\begin{bmatrix} -0.39 & 0.92 \\ -0.92 & -0.39 \end{bmatrix}^T$

$T=AV=U\Sigma=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix} \times\begin{bmatrix} -0.39 & 0.92\\ -0.92 & -0.39\\ \end{bmatrix} =\begin{bmatrix} -4.46 & 0.27\\ -3.15 & -0.25\\ -0.92 & -0.39\\ -0.06 & -1.15 \end{bmatrix}$
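These factors can be reproduced numerically; note that NumPy may return the singular vectors with opposite signs, since each singular vector is only determined up to sign:

```python
import numpy as np

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
U, s, Vt = np.linalg.svd(A)        # full form: U is 4x4, s holds the 2 singular values, Vt is 2x2

print(np.round(s, 2))              # [5.54 1.24]
print(np.round(Vt.T, 2))           # V, up to the sign of each column
print(np.round(A @ Vt.T, 2))       # T = A V, matching the result above up to column signs
```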

Example compact form

$A=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}=U\begin{bmatrix} -0.80 & 0.22\\ -0.56 & -0.20\\ -0.16 & -0.31\\ -0.01 & -0.89 \end{bmatrix}\times \Sigma\begin{bmatrix} 5.54 & 0\\ 0 & 1.24\\ \end{bmatrix}\times V^{T}\begin{bmatrix} -0.39 & 0.92\\ -0.92 & -0.39\\ \end{bmatrix}^{T}$

  • If A is of size m$\times$n, then U is m$\times$n, V is n$\times$n and $\Sigma$ is an n$\times$n matrix

  • This works because there are at most n non-zero singular values in $\Sigma$ (as the sketch below illustrates)
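In NumPy the compact form corresponds to `full_matrices=False` (a minimal sketch):

```python
import numpy as np

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])     # m = 4, n = 2

# Full form: U is m x m and Sigma is m x n (the extra columns of U meet only zero rows of Sigma)
U_full, s, Vt = np.linalg.svd(A, full_matrices=True)
# Compact form: U is m x n and Sigma is n x n -- nothing is lost
U_comp, s, Vt = np.linalg.svd(A, full_matrices=False)

print(U_full.shape, U_comp.shape)                     # (4, 4) (4, 2)
print(np.allclose(A, U_comp @ np.diag(s) @ Vt))       # True
```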

Dimensionality reduction using SVD

$A=U\Sigma V^{T}=\sum_{i=1}^{n}u_{i}\sigma_{ii}v_{i}^{T}$

  • Use only k dimensions

  • Retain first k columns for U and V and first k values for $\sum$

  • first k columns of V give the basis vectors in reduced space

  • This is the best rank-k approximation of A in terms of sum of squared errors

$A\approx \sum_{i=1}^{k}u_{i}\sigma_{ii}v_{i}^{T}=U_{1\ldots k}\Sigma_{1\ldots k}V^{T}_{1\ldots k}$

$T\approx A V_{1\ldots k}$
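A short sketch of rank-k truncation along these lines (the helper name `truncated_svd` is just for illustration):

```python
import numpy as np

def truncated_svd(A, k):
    """Rank-k approximation of A and the reduced representation T = A V_k."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    U_k, s_k, Vt_k = U[:, :k], s[:k], Vt[:k, :]
    A_k = U_k @ np.diag(s_k) @ Vt_k          # rank-k reconstruction
    T_k = A @ Vt_k.T                         # k-dimensional representation, equal to U_k Sigma_k
    return A_k, T_k

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
A_1, T_1 = truncated_svd(A, k=1)
print(np.round(A_1, 2))    # close to A, using a single dimension
print(np.round(T_1, 2))    # one coordinate per row (signs may differ from the hand-worked example)
```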

Example of dimensionality reduction (k=1)

$A\approx A_{k}= U_{k}\begin{bmatrix} -0.80 & 0 & 0 & 0\\ -0.56 & 0 & 0 & 0\\ -0.16 & 0 & 0 & 0\\ -0.01 & 0 & 0 & 0 \end{bmatrix} \times \Sigma_{k}\begin{bmatrix} 5.54 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix}\times V_{k}^{T}\begin{bmatrix} -0.39 & 0\\ -0.92 & 0\\ \end{bmatrix}^{T}=\begin{bmatrix} 1.74 & 4.10\\ 1.23 & 2.90\\ 0.35 & 0.84\\ 0.02 & 0.06 \end{bmatrix}$

$T\approx T_{k}=AV_{k}=U_{k}\Sigma_{k}=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39 & 0\\ -0.92 & 0\\ \end{bmatrix}=\begin{bmatrix} -4.46 & 0\\ -3.15 & 0\\ -0.92 & 0\\ -0.06 & 0 \end{bmatrix}$

Compact way of dimensionality reduction (k=1)

$A\approx A_{k}=U_{k}\begin{bmatrix} -0.80\\ -0.56\\ -0.16\\ -0.01 \end{bmatrix}\times \Sigma_{k}\begin{bmatrix}5.54\end{bmatrix}\times V_{k}^{T}\begin{bmatrix} -0.39\\ -0.92 \end{bmatrix} ^{T}=\begin{bmatrix} 1.74 & 4.10\\ 1.23 & 2.90\\ 0.35 & 0.84\\ 0.02 & 0.06 \end{bmatrix}$

$T\approx T_{k}=AV_{k}=U_{k}\Sigma_{k}=\begin{bmatrix} 2 & 4\\ 1 & 3\\ 0 & 1\\ -1 & 0.5 \end{bmatrix}\times\begin{bmatrix} -0.39 \\ -0.92 \end{bmatrix}=\begin{bmatrix} -4.46\\ -3.15\\ -0.92\\ -0.06 \end{bmatrix}$
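The "best rank-k approximation" claim can be checked for this example: the sum of squared errors of the rank-1 reconstruction equals the square of the discarded singular value, $\sigma_{22}^{2}\approx 1.24^{2}$:

```python
import numpy as np

A = np.array([[2, 4], [1, 3], [0, 1], [-1, 0.5]])
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-1 approximation: A_1 = sigma_11 * u_1 * v_1^T
A_1 = s[0] * np.outer(U[:, 0], Vt[0, :])

# The sum of squared errors equals the square of the discarded singular value
sse = np.sum((A - A_1) ** 2)
print(round(sse, 2), round(s[1] ** 2, 2))   # both are about 1.54
```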

Dimensions to retain:

  • Concept of energy of a dataset
  • Total energy is the sum of squares of the singular values

$E=\sum_{i=1}^{n}\sigma_{ii}^{2}$

  • Retain k dimensions such that p% of the energy is retained

$E_{k}=\sum_{i=1}^{k}\sigma_{ii}^{2}$

$E_{k}/E\geq p$

  • Generally, p is between 80% and 95% (see the sketch below)
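A small sketch of this selection rule (the helper name `choose_k` is just for illustration):

```python
import numpy as np

def choose_k(singular_values, p=0.9):
    """Smallest k that retains at least a fraction p of the total energy."""
    energy = np.asarray(singular_values, dtype=float) ** 2
    cumulative = np.cumsum(energy) / energy.sum()
    return int(np.searchsorted(cumulative, p) + 1)

s = np.array([5.54, 1.24])           # singular values from the example above
print(choose_k(s, p=0.90))           # 1, since 5.54^2 / (5.54^2 + 1.24^2) is about 0.95
print(choose_k(s, p=0.99))           # 2
```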

Advantages of Dimensionality Reduction

  • It helps in data compression, and hence reduces storage space.
  • It reduces computation time.
  • It also helps remove redundant features, if any.

Disadvantages of Dimensionality Reduction

  • It may lead to some amount of data loss.
  • PCA tends to find linear correlations between variables, which is sometimes undesirable.
  • PCA fails in cases where mean and covariance are not enough to define datasets.
  • We may not know how many principal components to keep; in practice, some rules of thumb are applied.

  • It is also helpful in noise removal, and as a result we can improve the performance of models.