Principal Component Analysis - Intro - Towards Data Science

PCA is a technique used to reduce the number of variables in a dataset while retaining most of the information. It works by transforming the data into a new set of variables called principal components, which account for most of the variance in the data. The principal components are linear combinations of the original variables and are extracted by analyzing the covariance matrix of the variables. Keeping only the components that explain most of the variance can reduce the dimensionality of the data.

Uploaded by

Alan Picard

9/7/2019 Principal Component Analysis- Intro - Towards Data Science

Principal Component Analysis- Intro

Variable Reduction Technique
Anuja Nagpal
Nov 21, 2017 · 3 min read

Too many variables? Should you use every available variable to build a model?

To handle the “curse of dimensionality” and avoid issues like over-fitting in high-dimensional space, methods like Principal Component Analysis (PCA) are used.
https://towardsdatascience.com/principal-component-analysis-intro-61f236064b38 1/4

PCA is a method used to reduce the number of variables in your data by extracting the important ones from a large pool. It reduces the dimension of your data with the aim of retaining as much information as possible. In other words, this method combines highly correlated variables to form a smaller set of artificial variables, called “principal components”, that account for most of the variance in the data.

Let’s dive in to understand how PCA is implemented behind the scenes.

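Start by normalizing the predictors: subtract each predictor’s mean from its data points so that every predictor has a mean of zero. This matters because the original predictors can be on different scales, and a predictor with a larger scale can dominate the variance. Note that subtracting the mean only centers the data; when the scales differ widely, it is common to also divide each predictor by its standard deviation. The result will look like table 2, with a mean of zero.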

Normalized Data
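
The centering step can be sketched in a few lines of plain Python. The dataset below is hypothetical (the article’s own table is not reproduced here), and the helper name `center` is an assumption made for illustration.

```python
# Hypothetical two-predictor dataset; these are NOT the article's values.
data = [
    [2.5, 2.4],
    [0.5, 0.7],
    [2.2, 2.9],
    [1.9, 2.2],
    [3.1, 3.0],
]

def center(rows):
    """Subtract each column's mean from every value in that column."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    centered_rows = [[x - m for x, m in zip(row, means)] for row in rows]
    return centered_rows, means

centered, means = center(data)

# After centering, every column should average to (numerically) zero.
col_means_after = [sum(col) / len(centered) for col in zip(*centered)]
```

If the predictors were on very different scales, each centered column could additionally be divided by its standard deviation before the next step.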

Next, calculate the covariance matrix for the data, which measures how pairs of predictors move together. Covariance is measured between two predictors, so if you have 3-dimensional data (x, x1, x2), measure the covariance between each pair: (x, x1), (x, x2) and (x1, x2). For reference, the covariance formula is:
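
The standard sample covariance between two predictors x and y over n observations can be written as below. The n − 1 denominator is an assumption here; some texts divide by n instead.

```latex
\operatorname{cov}(x, y) = \frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})
```

Here \bar{x} and \bar{y} are the means of x and y. When y is the same predictor as x, the formula reduces to the variance of x.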

In our case, the covariance matrix would look like this:

Covariance Matrix
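
As a rough sketch, the covariance matrix can be built by applying the sample covariance to every pair of columns. The data and the function name `covariance_matrix` are hypothetical, and the n − 1 denominator is an assumption.

```python
def covariance_matrix(rows):
    """Sample covariance (n - 1 denominator) between every pair of columns."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    return [
        [
            sum((a - ma) * (b - mb) for a, b in zip(ca, cb)) / (n - 1)
            for cb, mb in zip(cols, means)
        ]
        for ca, ma in zip(cols, means)
    ]

# Hypothetical data: the second predictor is exactly twice the first.
rows = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, 8.0]]
cov = covariance_matrix(rows)
```

The diagonal holds each predictor’s variance and the off-diagonal entries hold the covariances, so the matrix is always symmetric.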

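Now, calculate the eigenvalues and eigenvectors of the above matrix. This helps in finding underlying patterns in the data: each eigenvector gives a direction in the data, and its eigenvalue gives the variance along that direction. In our case they would be approximately: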

Eigen Value and Vector
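
For two predictors, the eigenvalues and eigenvectors of a symmetric 2×2 covariance matrix have a closed form, sketched below. The matrix values and the helper name `eig_symmetric_2x2` are hypothetical; for more than two predictors a numerical routine such as NumPy’s `numpy.linalg.eigh` would normally be used instead.

```python
import math

def eig_symmetric_2x2(m):
    """Eigenvalues (largest first) and unit eigenvectors of [[a, b], [b, d]]."""
    (a, b), (_, d) = m
    if b == 0:
        # Already diagonal: the coordinate axes are the eigenvectors.
        pairs = [(a, (1.0, 0.0)), (d, (0.0, 1.0))]
        pairs.sort(key=lambda p: p[0], reverse=True)
    else:
        mid = (a + d) / 2
        spread = math.hypot((a - d) / 2, b)
        pairs = []
        for lam in (mid + spread, mid - spread):
            vx, vy = b, lam - a  # solves (M - lam * I) v = 0 when b != 0
            norm = math.hypot(vx, vy)
            pairs.append((lam, (vx / norm, vy / norm)))
    values = [lam for lam, _ in pairs]
    vectors = [vec for _, vec in pairs]
    return values, vectors

cov = [[2.0, 1.0], [1.0, 2.0]]  # hypothetical covariance matrix
values, vectors = eig_symmetric_2x2(cov)
```

The eigenvector paired with the largest eigenvalue is the first principal component, and the two eigenvectors are perpendicular to each other.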

We are almost there :). Perform reorientation. To convert the data onto the new axes, multiply the normalized data by the eigenvectors, which give the directions of the new axes. Note that you can choose to leave out the eigenvector with the smaller eigenvalue or use both. Also, decide how many components to keep based on which set accounts for 95% or more of the variance.
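
A minimal sketch of these two decisions follows: how many components reach the variance threshold, and how to project the centered data onto the kept eigenvectors. The function names, eigenvalues and data are hypothetical, and the 0.95 default simply mirrors the 95% figure mentioned above.

```python
import math

def explained_variance_ratio(eigenvalues):
    """Share of the total variance carried by each eigenvalue."""
    total = sum(eigenvalues)
    return [v / total for v in eigenvalues]

def components_needed(eigenvalues, threshold=0.95):
    """Smallest number of leading components whose cumulative share reaches threshold."""
    ordered = sorted(eigenvalues, reverse=True)
    cumulative = 0.0
    for k, share in enumerate(explained_variance_ratio(ordered), start=1):
        cumulative += share
        if cumulative >= threshold:
            return k
    return len(ordered)  # fallback in case rounding keeps the sum just below threshold

def project(centered_rows, eigenvectors):
    """Score of each centered row on each kept eigenvector (a dot product per pair)."""
    return [
        [sum(x * w for x, w in zip(row, vec)) for vec in eigenvectors]
        for row in centered_rows
    ]

# Hypothetical inputs: mean-centered rows and one kept unit eigenvector.
s = 1 / math.sqrt(2)
centered_rows = [[1.0, 1.0], [-1.0, -1.0]]
scores = project(centered_rows, [(s, s)])

k_two = components_needed([3.0, 1.0])    # first component carries 75%, so both are needed
k_one = components_needed([1.96, 0.04])  # first component carries 98%, so one suffices
```

Each row of `scores` is the position of an observation on the new axes, which is what gets plotted or passed to a model.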

Finally, the scores calculated in the above step can be plotted and fed into the predictive model. Plots give us a sense of how close or highly correlated two variables are. Instead of plotting the original data on the X and Y axes, which doesn’t tell us much about how points are related to each other, we plot the transformed data (using the eigenvectors), which reveals patterns and shows the relationships between points.

End Note: It is easy to confuse PCA with Factor Analysis, but there is a conceptual difference between the two methods. I will go into the details of Factor Analysis and how it differs from PCA in my next post. Stay tuned.
