PCA - Principal Component Analysis: Step by Step Computation of PCA


PCA – Principal Component Analysis

Principal Component Analysis (PCA) is a dimensionality reduction technique that identifies correlations and patterns in a data set so that it can be transformed into a data set of significantly lower dimension without losing any important information.
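
For illustration, a data set with many variables can be compressed to a handful of components. Below is a minimal scikit-learn sketch; the array shapes and random data are purely illustrative and not taken from the slides:

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.rand(100, 10)                       # 100 samples, 10 variables (made-up data)
    X_reduced = PCA(n_components=3).fit_transform(X)  # keep the 3 most significant components
    print(X_reduced.shape)                            # (100, 3)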

Step By Step Computation Of PCA

The following steps need to be followed to perform dimensionality reduction using PCA:
1. Standardization of the data
2. Computing the covariance matrix
3. Calculating the eigenvectors and eigenvalues
4. Computing the Principal Components
5. Reducing the dimensions of the data set

Step 1: Standardization of the data

Standardizing the data into a comparable range is very important. Standardization is carried out by subtracting the mean of each variable from every value and dividing the result by that variable's standard deviation.
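
A minimal NumPy sketch of this step (the array X and its values are made up purely for illustration):

    import numpy as np

    # Toy data set: 5 samples (rows) and 3 variables (columns); values are illustrative only.
    X = np.array([[2.5, 2.4, 0.5],
                  [0.5, 0.7, 1.2],
                  [2.2, 2.9, 0.9],
                  [1.9, 2.2, 1.1],
                  [3.1, 3.0, 0.4]])

    # Subtract each variable's mean and divide by its standard deviation.
    X_std = (X - X.mean(axis=0)) / X.std(axis=0)

    print(X_std.mean(axis=0))  # approximately 0 for every column
    print(X_std.std(axis=0))   # 1 for every column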
Step 2: Computing the covariance matrix

A covariance matrix expresses the correlation between the different variables in the data set. It is essential to identify heavily dependent variables because they contain biased and redundant information, which reduces the overall performance of the model.

Step 3: Calculating the Eigenvectors and Eigenvalues

Eigenvectors and eigenvalues are the mathematical constructs that must be computed from the covariance matrix in order to determine the principal components of the data set.
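
A sketch of Steps 2 and 3 together, continuing from the standardized array X_std in the earlier sketch:

    # Covariance matrix of the standardized data; rowvar=False treats columns as variables.
    cov_matrix = np.cov(X_std, rowvar=False)

    # The covariance matrix is symmetric, so np.linalg.eigh is appropriate; it returns
    # eigenvalues in ascending order and the matching eigenvectors as columns.
    eigenvalues, eigenvectors = np.linalg.eigh(cov_matrix)

    print(cov_matrix.shape)  # (3, 3): one row/column per variable
    print(eigenvalues)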

Step 4: Computing the Principal Components

Once the eigenvectors and eigenvalues have been computed, all we have to do is order them in descending order of eigenvalue: the eigenvector with the highest eigenvalue is the most significant and forms the first principal component. The principal components of lesser significance can then be removed in order to reduce the dimensions of the data.
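
Continuing the sketch, ordering the eigenpairs and keeping only the top components (the choice of k = 2 is arbitrary, for illustration):

    # Order eigenpairs by descending eigenvalue; the eigenvector with the largest
    # eigenvalue forms the first principal component.
    order = np.argsort(eigenvalues)[::-1]
    eigenvalues = eigenvalues[order]
    eigenvectors = eigenvectors[:, order]

    # Keep only the k most significant components and discard the rest.
    k = 2
    feature_vector = eigenvectors[:, :k]
    print(feature_vector.shape)  # (3, 2): one column per retained component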

Step 5: Reducing the dimensions of the data set

The last step in performing PCA is to re-arrange the original data along the final principal components, which represent the maximum and most significant information in the data set. To replace the original data axes with the newly formed principal components, you simply multiply the transpose of the feature vector (the matrix of retained eigenvectors) by the transpose of the standardized data set.
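
The projection itself, continuing from the same sketch:

    # Project the standardized data onto the retained principal components.
    # (feature_vector.T @ X_std.T).T gives the same result as X_std @ feature_vector.
    X_reduced = X_std @ feature_vector
    print(X_reduced.shape)  # (5, 2): 5 samples now described by 2 components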

Data X – Evaluating Variables Analysis

                Criteria 1   Criteria 2   Criteria 3   ...   Criteria M
    TPM/TPL 1       1            5
    TPM/TPL 2      1/5           1
    TPM/TPL 3       2           0.5           1
    TPM/TPL N      3.5          2.5           1

PCA produces three types of analysis:
• The Empirical Orthogonal Functions (EOFs): the patterns and structure in the data
• The Principal Components (PCs): a time series reflecting the relative contribution of each EOF
• The eigenvalues: give the overall importance of each EOF

1. Start from the data matrix X given above; the table is of size M x N (M variables/measurements, N sample size).
2. Calculate the covariance matrix S based on X.
3. Solve Se = λe for the eigenvectors e and eigenvalues λ (M EOFs and M eigenvalues).
4. Solve P = Xe to calculate the principal components (N PCs).
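
A rough NumPy sketch of this procedure; the data matrix here is random, and its shape (N = 20 samples, M = 4 variables) is only a hypothetical stand-in for the TPM/TPL table above:

    import numpy as np

    rng = np.random.default_rng(0)
    N, M = 20, 4                      # N samples, M variables (hypothetical sizes)
    X = rng.random((N, M))

    # 2. Covariance matrix S based on X (M x M).
    S = np.cov(X, rowvar=False)

    # 3. Solve S e = lambda e: M eigenvalues and M EOFs (one eigenvector per column).
    eigenvalues, EOFs = np.linalg.eigh(S)

    # 4. P = X e: each column of P is a principal-component time series of length N.
    P = X @ EOFs

    print(S.shape, eigenvalues.shape, P.shape)  # (4, 4) (4,) (20, 4)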
