CSL0777 L23


Program: B.Tech VII Semester

CSL0777: Machine Learning


Unit No. 3
Supervised learning Part-2

Lecture No. 23
Support Vector Machine Algorithm
Mr. Praveen Gupta
Assistant Professor, CSA/SOET
Outlines
• SVM Algorithm for Machine Learning
• Types of SVM
• Hyper plane and support vectors in SVM
• How does Linear SVM work?
• How does Non Linear SVM work?
• Python implementation of the SVM algorithm
• References
Student Effective Learning Outcomes (SELO)
01: Ability to understand subject-related concepts clearly, along with
contemporary issues.
02: Ability to use updated tools, techniques and skills for effective
domain-specific practices.
03: Understanding of available tools and products, and ability to use them effectively.
Support Vector Machine Algorithm
• Support Vector Machine (SVM) is one of the most
popular Supervised Learning algorithms. It is used
for Classification as well as Regression problems,
but primarily for Classification problems in
Machine Learning.
• The goal of the SVM algorithm is to create the best
line or decision boundary that can segregate
n-dimensional space into classes, so that we can easily
put a new data point in the correct category in the
future. This best decision boundary is called a
hyperplane.
Support Vector Machine Algorithm
• SVM chooses the extreme points/vectors that help in creating the
hyperplane. These extreme cases are called support vectors, and
hence the algorithm is termed Support Vector Machine.

Support Vector Machine Algorithm
• Suppose we see a strange cat that also has some features of
dogs. If we want a model that can accurately identify
whether it is a cat or a dog, such a model can be created
using the SVM algorithm.
• We first train the model with many images of cats and
dogs so that it learns the different features of cats and
dogs, and then we test it with this strange creature.
• Because the SVM draws a decision boundary between
these two classes (cat and dog) using the extreme cases
(support vectors) of each, it compares the new creature
against those extreme cases. On the basis of the support
vectors, it classifies the creature as a cat.

Types of SVM

• Linear SVM: Linear SVM is used for linearly
separable data. If a dataset can be classified into
two classes by using a single straight line, then
such data is termed linearly separable data, and
the classifier used is called a Linear SVM classifier.
• Non-linear SVM: Non-linear SVM is used for
non-linearly separable data. If a dataset cannot be
classified by using a straight line, then such data
is termed non-linear data, and the classifier used
is called a Non-linear SVM classifier.
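
As a rough illustration of the two types, the sketch below fits a linear and a non-linear SVM to the same data and compares their training accuracy. The concentric-rings toy dataset (scikit-learn's make_circles) and the choice of the RBF kernel are assumptions made for this example; they are not taken from the lecture.

```python
from sklearn.datasets import make_circles
from sklearn.svm import SVC

# Toy data (assumption): two concentric rings, which no straight line can separate.
X, y = make_circles(n_samples=200, factor=0.3, noise=0.05, random_state=0)

# Same data, two classifiers: a linear kernel and a non-linear (RBF) kernel.
linear_svm = SVC(kernel="linear").fit(X, y)
nonlinear_svm = SVC(kernel="rbf").fit(X, y)

# Training accuracy of each classifier.
linear_acc = linear_svm.score(X, y)
nonlinear_acc = nonlinear_svm.score(X, y)
print(f"linear kernel accuracy: {linear_acc:.2f}")
print(f"rbf kernel accuracy:    {nonlinear_acc:.2f}")
```

On data like this the linear classifier should do little better than guessing, while the non-linear one should separate the rings almost perfectly.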
Hyperplane in the SVM algorithm
• There can be multiple lines/decision boundaries that segregate
the classes in n-dimensional space, but we need to find
the best decision boundary for classifying the data
points. This best boundary is known as the hyperplane of
SVM.
• The dimension of the hyperplane depends on the number of
features in the dataset: if there are 2 features (as
shown in the image), the hyperplane is a straight line, and if
there are 3 features, the hyperplane is a 2-dimensional
plane.
• We always create the hyperplane that has the maximum margin,
that is, the maximum distance between the hyperplane and the
nearest data points of each class.
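
For two features, a hyperplane can be written as the line w1*x1 + w2*x2 + b = 0. The minimal sketch below shows how the sign of w·x + b tells which side of the line a point falls on, and how far the point is from the line. The values of w, b and the points are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical hyperplane parameters (assumed values): the line x1 + x2 - 5 = 0.
w = np.array([1.0, 1.0])   # normal vector of the hyperplane
b = -5.0                   # offset

# Three example points (assumed): one on each side and one on the line itself.
points = np.array([[1.0, 2.0], [4.0, 4.0], [2.5, 2.5]])

# Signed score: positive on one side of the hyperplane, negative on the other.
scores = points @ w + b
sides = np.sign(scores)

# Perpendicular distance of each point from the hyperplane.
distances = np.abs(scores) / np.linalg.norm(w)

for p, s, d in zip(points, sides, distances):
    print(p, "side:", s, "distance:", round(float(d), 3))
```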

Support Vectors in the SVM algorithm

The data points or vectors that are closest
to the hyperplane, and which affect the
position of the hyperplane, are termed
support vectors. They are so called because
these vectors support the hyperplane.

How does SVM work (Linear SVM)?

The working of the SVM algorithm can be understood
by using an example. Suppose we have a dataset that
has two tags (green and blue) and two features, x1
and x2. We want a classifier that can classify each
pair of coordinates (x1, x2) as either green or blue.

How does SVM work (Linear SVM)?

The SVM algorithm finds the best line or decision
boundary; this best boundary is called the
hyperplane. To do so, it finds the points of both
classes that lie closest to the boundary. These points
are called support vectors. The distance between the
support vectors and the hyperplane is called the margin,
and the goal of SVM is to maximize this margin.
The hyperplane with the maximum margin is called
the optimal hyperplane.
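
The idea can be sketched with scikit-learn on a tiny, clearly separable dataset. The six points below and the large value C=1e6 (used here to approximate a margin with no violations) are assumptions for illustration only.

```python
import numpy as np
from sklearn.svm import SVC

# Hypothetical, linearly separable toy data with two features (x1, x2).
X = np.array([[1, 1], [2, 1], [1, 2], [4, 4], [5, 4], [4, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])

# A very large C (assumption) makes the classifier tolerate almost no margin violations.
clf = SVC(kernel="linear", C=1e6).fit(X, y)

w = clf.coef_[0]         # normal vector of the learned hyperplane
b = clf.intercept_[0]    # offset of the learned hyperplane
support_vectors = clf.support_vectors_

# Distance of each support vector from the hyperplane w.x + b = 0.
sv_distances = np.abs(support_vectors @ w + b) / np.linalg.norm(w)

# Margin as defined above: distance between the support vectors and the hyperplane.
margin = 1.0 / np.linalg.norm(w)

print("support vectors:\n", support_vectors)
print("distances to hyperplane:", sv_distances)
print("margin:", margin)
```

Every support vector should lie at (approximately) the same distance from the hyperplane, and that common distance is the margin being maximized.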

How does SVM work (Non-Linear SVM)?

If the data is linearly arranged, we can separate it by using a
straight line, but for non-linear data we cannot draw a single
straight line.

How does SVM work (Non-Linear SVM)?

So, to separate these data points, we need to
add one more dimension. For linear data, we
used two dimensions, x and y, so for non-linear
data we add a third dimension, z. It can be
calculated as:
z = x^2 + y^2
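
A minimal sketch of this transformation, assuming two hypothetical rings of points (radius 1 and radius 3) that no straight line in the x-y plane can separate. After adding z, a flat plane z = constant separates them; the threshold of 5.0 is an assumed value picked between the two rings.

```python
import numpy as np

# Hypothetical data: an inner ring (radius 1) and an outer ring (radius 3).
angles = np.linspace(0.0, 2.0 * np.pi, 50, endpoint=False)
inner = 1.0 * np.column_stack([np.cos(angles), np.sin(angles)])
outer = 3.0 * np.column_stack([np.cos(angles), np.sin(angles)])

def add_z(points):
    """Append the third dimension z = x^2 + y^2 to each (x, y) point."""
    z = points[:, 0] ** 2 + points[:, 1] ** 2
    return np.column_stack([points, z])

inner_3d = add_z(inner)
outer_3d = add_z(outer)

# In 3-D the plane z = threshold (assumed value) now separates the two classes.
threshold = 5.0
separable = inner_3d[:, 2].max() < threshold < outer_3d[:, 2].min()
print("separable by the plane z = 5:", separable)
```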

Python implementation of the SVM algorithm

Example: We are given a dataset that contains
information about various users, obtained from
social networking sites. A car manufacturer has
recently launched a new SUV, and the company
wants to check how many users in the dataset
want to purchase the car.

Python implementation of the SVM algorithm

Steps to implement the SVM algorithm:
• Data pre-processing step
• Fitting the SVM algorithm to the training set
• Predicting the test result
• Testing the accuracy of the result (creation of the confusion matrix)
• Visualizing the test set result

Python implementation of the SVM algorithm

1. Data Pre-processing step: In this step, we will
pre-process/prepare the data so that we can
use it in our code efficiently.
# Data pre-processing step
# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd

# importing the dataset
data_set = pd.read_csv('user_data.csv')
Python implementation of the SVM algorithm

Now we will extract the independent and
dependent variables from the given dataset.
Below is the code for it:
# Extracting the independent and dependent variables
x = data_set.iloc[:, [2, 3]].values   # columns 2 and 3: independent variables
y = data_set.iloc[:, 4].values        # column 4: dependent variable
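
Since user_data.csv is not reproduced here, the sketch below uses a small hypothetical DataFrame in its place to show what the two iloc selections return. The column names and values are assumptions; only the column positions (2, 3 and 4) come from the code above.

```python
import pandas as pd

# Hypothetical stand-in for user_data.csv; column names and values are assumed.
data_set = pd.DataFrame({
    "User ID": [1, 2, 3, 4],
    "Gender": ["Male", "Female", "Female", "Male"],
    "Age": [19, 35, 26, 47],
    "EstimatedSalary": [19000, 20000, 43000, 25000],
    "Purchased": [0, 0, 0, 1],
})

# Columns at positions 2 and 3 become the feature matrix (2-D array).
x = data_set.iloc[:, [2, 3]].values
# Column at position 4 becomes the target vector (1-D array).
y = data_set.iloc[:, 4].values

print(x.shape, y.shape)
```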

Python implementation of the SVM algorithm

Now we will split the dataset into a training set and
a test set. Below is the code for it:
# Splitting the dataset into training and test sets
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.25, random_state=0)
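
To see what test_size=0.25 does, the sketch below splits a hypothetical dataset of 20 rows (the synthetic arrays are assumptions, not the lecture's data): a quarter of the rows go to the test set and the rest to the training set.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 20 samples with 2 features each, and alternating 0/1 labels.
x = np.arange(40).reshape(20, 2)
y = np.array([0, 1] * 10)

# 25% of the rows are held out for testing; random_state fixes the shuffle.
x_train, x_test, y_train, y_test = train_test_split(
    x, y, test_size=0.25, random_state=0
)

print("training rows:", x_train.shape[0])
print("test rows:", x_test.shape[0])
```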

Python implementation of the SVM algorithm

We will apply feature scaling because SVM relies on
distances between data points, so a feature measured on
a much larger scale would otherwise dominate the result.
Here we scale only the independent variables, because the
dependent variable contains only the values 0 and 1.
Below is the code for it:
# Feature scaling
from sklearn.preprocessing import StandardScaler
st_x = StandardScaler()
x_train = st_x.fit_transform(x_train)   # learn mean/std from the training set
x_test = st_x.transform(x_test)         # reuse the training mean/std
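
The sketch below illustrates why fit_transform is called on the training set and only transform on the test set: the scaler learns the mean and standard deviation from the training data and reuses them for the test data. The age/salary values are hypothetical.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training and test features: [age, salary].
x_train = np.array([[20, 20000], [30, 40000], [40, 60000], [50, 80000]], dtype=float)
x_test = np.array([[35, 50000]], dtype=float)

st_x = StandardScaler()
# fit_transform learns each column's mean and standard deviation from the training set.
x_train_scaled = st_x.fit_transform(x_train)
# transform reuses those training statistics; nothing is learned from the test set.
x_test_scaled = st_x.transform(x_test)

print("training column means after scaling:", x_train_scaled.mean(axis=0))
print("training column stds after scaling: ", x_train_scaled.std(axis=0))
print("scaled test point:", x_test_scaled)
```

After scaling, both features have zero mean and unit standard deviation on the training set, so neither dominates purely because of its units.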

Python implementation of the SVM algorithm

2. Fitting the SVM classifier to the training data:
Now the SVM classifier will be fitted to the
training set. To create the SVM classifier, we
import the SVC class from the sklearn.svm module.
We use kernel='linear' because here we are creating
an SVM for linearly separable data.

Python implementation of the SVM algorithm

2. Fitting the SVM classifier to the training data:
from sklearn.svm import SVC   # "Support vector classifier"
classifier = SVC(kernel='linear', random_state=0)
classifier.fit(x_train, y_train)

Python implementation of the SVM algorithm

3. Predicting the test result
Our model has now been trained on the training set, so
we will predict the result by using the test set data.
Below is the code for it:
# Predicting the test set result
y_pred = classifier.predict(x_test)

Python implementation of the SVM algorithm

4. Test accuracy of the result
Now we will create the confusion matrix to check
the accuracy of the classification.
To create it, we import the confusion_matrix
function from sklearn.metrics.
After importing the function, we call it and store
the result in a new variable, cm.
The function takes two main parameters: y_test (the
actual values) and y_pred (the predicted values
returned by the classifier).

Python implementation of the SVM algorithm

4. Test accuracy of the result
# Creating the confusion matrix
from sklearn.metrics import confusion_matrix
cm = confusion_matrix(y_test, y_pred)
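
To make the layout of the matrix concrete, here is a small sketch with hypothetical actual and predicted labels (eight made-up samples, not the lecture's data). In scikit-learn's convention, rows correspond to actual classes and columns to predicted classes.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical actual and predicted labels for 8 test samples.
y_test = [0, 0, 0, 1, 1, 1, 0, 1]
y_pred = [0, 0, 1, 1, 1, 0, 0, 1]

# Rows are actual classes (0, 1); columns are predicted classes (0, 1).
cm = confusion_matrix(y_test, y_pred)

# For a binary problem, ravel() yields TN, FP, FN, TP in that order.
tn, fp, fn, tp = cm.ravel()
print(cm)
print("TN:", tn, "FP:", fp, "FN:", fn, "TP:", tp)
```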

Python implementation of the SVM algorithm

We can find the accuracy of the predicted result by
interpreting the confusion matrix. From the confusion
matrix output, we can see 66+24 = 90 correct
predictions and 8+2 = 10 incorrect predictions.

Accuracy = (TP+TN) / Total
(66+24) / 100 = 0.90
Therefore, the accuracy of the model is 90%.
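
The same arithmetic can be sketched in code. The lecture gives only the counts (66 and 24 correct, 8 and 2 incorrect), so the placement of 8 and 2 in the off-diagonal cells below is an assumption; it does not affect the accuracy.

```python
import numpy as np

# Counts from the lecture; which off-diagonal cell holds 8 and which holds 2 is assumed.
cm = np.array([[66, 2],
               [8, 24]])

# Correct predictions lie on the diagonal of the confusion matrix.
correct = np.trace(cm)
total = cm.sum()
accuracy = correct / total

print(f"correct: {correct}, total: {total}, accuracy: {accuracy:.2f}")
```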

Learning Outcomes

The students have learned and understood the following:
• SVM Algorithm for Machine Learning
• Types of SVM
• Hyper plane and support vectors in SVM
• How does Linear SVM work?
• How does Non Linear SVM work?
• Python implementation of the SVM algorithm
References

1. Machine Learning for Absolute Beginners by Oliver Theobald. 2019
2. http://noracook.io/Books/Python/introductiontomachinelearningwithpython.pdf
3. https://www.tutorialspoint.com/machine_learning_with_python/machine_learning_with_python_tutorial.pdf
4. https://www.javatpoint.com/k-nearest-neighbor-algorithm-for-machine-learning
Thank you
