Created by Abhishek C. Salian
$\Huge {\underline{PRINCIPAL \,\,\,\,\,COMPONENT\,\,\,\, \, ANALYSIS \,\,\,\,\, (PCA)}}$
PCA finds a lower-dimensional plane (i.e. a subspace) onto which to project the data, such that the sum of squared distances of the points from that plane is minimized.
The following steps will help in understanding the PCA algorithm:
We standardize the features by removing the mean and scaling to unit variance. To do this, the following vectorized formulation is used:
$\Large X_s =\Bigl(\frac{X - \mu}{\sigma}\Bigr)$
$\Large \mu = \frac{1}{N}\sum_{i=1}^N(x_i)$
$\Large \sigma =\sqrt{\frac{1}{N}\sum_{i=1}^N(x_i - \mu)^2}$
where $X_s$ is the standardized feature matrix, $X$ is the original feature matrix, $\mu$ is the vector of feature means, and $\sigma$ is the vector of feature standard deviations.
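As a minimal sketch of this step (the data below is an illustrative stand-in, not the data used in the figures):

```python
import numpy as np

# Standardize each feature: subtract its mean and divide by its standard deviation.
# X is assumed to be an (N, n) array of N samples and n features.
def standardize(X):
    mu = X.mean(axis=0)       # per-feature mean
    sigma = X.std(axis=0)     # per-feature standard deviation (1/N convention)
    return (X - mu) / sigma

X = np.random.randn(100, 3) * [2.0, 5.0, 0.5] + [1.0, -3.0, 4.0]
X_s = standardize(X)
print(X_s.mean(axis=0).round(6), X_s.std(axis=0).round(6))  # ~0 means, ~1 stds
```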
For data with 2 features, i.e. $X \in R^2$, the following is the standardization result.
For data with 3 features, i.e. $X \in R^3$, the following is the standardization result.
Next, we find the eigenvectors and eigenvalues of the covariance matrix of $X_s$, where the covariance matrix is given below:
$\Large Covariance\,\,\,matrix = S = \left[ {\begin{array}{cc} \sigma_{x}^2 & \sigma_{xy}\\ \sigma_{yx} & \sigma_{y}^2\\ \end{array} } \right]$
The formula for each element of $S$ can be written as
$\Large \sigma_{jk} = \frac{1}{N}\sum_{i=1}^N \bigl(x_{ij} - \mu_j\bigr)\bigl(x_{ik} - \mu_k\bigr)$
where $x_{ij}$ is the value of the $j$-th feature for the $i$-th sample.
Note: a similar matrix can be computed for N-dimensional data.
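A short sketch of how the covariance matrix could be computed with NumPy ($X_s$ here is stand-in data; `bias=True` matches the 1/N convention used above):

```python
import numpy as np

# Covariance matrix of the standardized data.
X_s = np.random.randn(100, 3)                 # stand-in for the standardized features
S = np.cov(X_s, rowvar=False, bias=True)      # rowvar=False: columns are features,
                                              # bias=True: 1/N normalization
print(S)                                      # (n, n) covariance matrix
```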
The eigenvectors $v$ and eigenvalues $\lambda$ of $S$ satisfy
$\large (S - \lambda \cdot I)\,v\,\,=\,\,\, 0$
When we decompose the data (or signal), we get its characteristic vectors, which we call eigenvectors. Eigenvectors are not rotated by the transformation; they can only be stretched. This stretching factor for each eigenvector is its eigenvalue.
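A possible NumPy sketch of this eigendecomposition of the covariance matrix (again with stand-in data):

```python
import numpy as np

# Eigenvalues and eigenvectors of the covariance matrix S.
# eigh is used because S is symmetric; it returns eigenvalues in ascending order.
X_s = np.random.randn(100, 3)                 # stand-in for the standardized data
S = np.cov(X_s, rowvar=False, bias=True)
eig_vals, eig_vecs = np.linalg.eigh(S)
order = np.argsort(eig_vals)[::-1]            # reorder to descending
eig_vals, eig_vecs = eig_vals[order], eig_vecs[:, order]
print(eig_vals)                               # variance along each principal direction
```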
Alternatively, the decomposition can be obtained via the singular value decomposition (SVD):
$\Large A = U\Sigma V^T$
Below are the steps for eigendecomposition using SVD (a code sketch follows the list):
1) Compute the transpose $A^T$ and form the product $A^TA$
2) Determine the eigenvalues of $A^TA$ and sort them in descending order of absolute value. Take the square roots of these to obtain the singular values of $A$.
3) Construct the diagonal matrix $\Sigma$ by placing the singular values in descending order along its diagonal, and compute its inverse, $\Sigma^{-1}$
4) Use the ordered eigenvalues from step 2 to compute the corresponding eigenvectors of $A^TA$. Place these eigenvectors along the columns of $V$ and compute its transpose, $V^T$
5) Compute $U$ as $U \,\,=\,\, AV\Sigma^{-1}$. To verify, reconstruct $A_r \,\,=\,\, U\Sigma V^T$ and check that $A_r\,\,=\,\,A$
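The following is a minimal sketch of steps 1-5 on a small, made-up matrix $A$:

```python
import numpy as np

# Illustrative matrix (values are made up).
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# Steps 1-2: eigenvalues/eigenvectors of A^T A, sorted in descending order;
# the singular values are their square roots.
eig_vals, V = np.linalg.eigh(A.T @ A)
order = np.argsort(eig_vals)[::-1]
eig_vals, V = eig_vals[order], V[:, order]
singular_values = np.sqrt(eig_vals)

# Step 3: diagonal matrix of singular values and its inverse.
Sigma = np.diag(singular_values)
Sigma_inv = np.linalg.inv(Sigma)

# Steps 4-5: V already holds the eigenvectors as columns; U = A V Sigma^{-1}.
U = A @ V @ Sigma_inv
A_r = U @ Sigma @ V.T                  # reconstruct A for verification
print(np.allclose(A_r, A))             # True
```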
This can be computed directly using SciPy's linear algebra package, as sketched below.
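For instance, a sketch using `scipy.linalg.svd` (with stand-in data for $X_s$):

```python
import numpy as np
from scipy import linalg

# SVD of the standardized data; the rows of Vt are the principal directions.
X_s = np.random.randn(100, 4)                 # stand-in for the standardized data
U, s, Vt = linalg.svd(X_s, full_matrices=False)
# For mean-centered X_s, s**2 / N equals the eigenvalues of the covariance
# matrix (1/N convention), i.e. the variance explained by each component.
explained_variance = s**2 / X_s.shape[0]
print(Vt.shape, explained_variance)
```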
The resulting principal components would look like the figure below.
We sort the eigenvalues in decreasing order. The eigenvectors must be reordered simultaneously so that each stays paired with its eigenvalue.
Out of "n" dimension we will only select "k" dimension where k<n.(this is where we are reducing dimension by only selecting **Principal Components).
*(Figure 1 and Figure 2: explained variance plots for two datasets)*
The figures above are explained variance plots for two data distributions. They show the variance associated with each eigenvalue.
The larger the magnitude of an eigenvalue, the more variance is associated with it. Therefore, we select only the eigenvalues with high variance, since they carry the most information.
For example:
$\lambda_1=0.76 ,\lambda_2=0.14 , \lambda_3=0.08,\lambda_4=0.02$
The above example shows the eigenvalues of 4-D data; it can be reduced to 3-D by selecting only $\lambda_1, \lambda_2, \lambda_3$, which together retain 98% of the total variance.
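A quick sketch of this calculation for the example eigenvalues above:

```python
import numpy as np

# Cumulative explained variance for the 4-D example, used to decide how many
# components k to keep.
eig_vals = np.array([0.76, 0.14, 0.08, 0.02])
explained_ratio = eig_vals / eig_vals.sum()
print(np.cumsum(explained_ratio))   # [0.76 0.9  0.98 1.  ] -> k = 3 retains 98%
```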
Create a projection matrix $U_{proj}$, formed by stacking the eigenvectors corresponding to the selected "k" eigenvalues, i.e. the principal components:
$\LARGE U =\left[ {\begin{array}{cccc} | & | & \cdots & |\\ u^1 & u^2 & \cdots & u^n\\ | & | & \cdots & | \\ \end{array} } \right]$
$\Huge \downarrow$
$\LARGE U_{proj}=\left[ {\begin{array}{cccc} | & | & \cdots & |\\ u^1 & u^2 & \cdots & u^k\\ | & | & \cdots & | \\ \end{array} } \right]$
Take the matrix product of $X_s$ and $U_{proj}$ to project the data onto the selected principal components:
$\huge X_{pca} = X_s \cdot U_{proj} $
*(Figure: $X_{pca}$ for data1)*
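Putting the selection and projection steps together, a minimal sketch (with stand-in data; `k` and the variable names are illustrative):

```python
import numpy as np

# Build U_proj from the top-k eigenvectors (columns of eig_vecs, sorted by
# eigenvalue) and project the standardized data onto them.
X_s = np.random.randn(100, 4)                 # stand-in for the standardized data
S = np.cov(X_s, rowvar=False, bias=True)
eig_vals, eig_vecs = np.linalg.eigh(S)
order = np.argsort(eig_vals)[::-1]
eig_vecs = eig_vecs[:, order]

k = 3
U_proj = eig_vecs[:, :k]       # (n, k) projection matrix
X_pca = X_s @ U_proj           # (N, k) data in the reduced space
print(X_pca.shape)             # (100, 3)
```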