Practical File of AI and ML


SHAHEED BHAGAT SINGH STATE UNIVERSITY, FEROZEPUR, PUNJAB

PRACTICAL FILE OF AI AND ML

Submitted by:                 Submitted to:
Akash Kumar                   Dr. Sunny Behal
2007755                       (HOD of CSE)
CSE (Data Science)
PRACTICAL NO.1
Linear Regression in Python
……………………………..
PRACTICAL NO.2
Predict Employee Attrition Using Machine Learning &
Python
………………………………….
PRACTICAL NO.3
Python Implementation of the K-Means Clustering
Algorithm

Here’s how to implement the K-Means clustering algorithm in Python. These are the steps you need to take:

• Data pre-processing
• Finding the optimal number of clusters using the elbow method
• Training the K-Means algorithm on the training dataset
• Visualizing the clusters

1. Data Pre-Processing. Import the libraries, load the dataset, and extract the independent variables.
# importing libraries
import numpy as nm
import matplotlib.pyplot as mtp
import pandas as pd
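# Importing the dataset
dataset = pd.read_csv('Mall_Customers_data.csv')
# columns 3 and 4 are the two features plotted later:
# Annual Income (k$) and Spending Score (1-100)
x = dataset.iloc[:, [3, 4]].values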

2. Find the optimal number of clusters using the elbow method. Here’s the code you use:
# finding the optimal number of clusters using the elbow method
from sklearn.cluster import KMeans

wcss_list = []  # WCSS (within-cluster sum of squares) for each k

# fit K-Means for k = 1 to 10 and record the WCSS of each fit
for i in range(1, 11):
    kmeans = KMeans(n_clusters=i, init='k-means++', random_state=42)
    kmeans.fit(x)
    wcss_list.append(kmeans.inertia_)

mtp.plot(range(1, 11), wcss_list)
mtp.title('The Elbow Method Graph')
mtp.xlabel('Number of clusters (k)')
mtp.ylabel('WCSS')
mtp.show()
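
To make the plotted quantity concrete, here is a minimal sketch of what WCSS measures: the sum, over all points, of the squared distance to the nearest centroid, which is how scikit-learn describes `inertia_`. The points and centroids are made-up toy values (not the mall data), and `squared_distance` and `wcss` are hypothetical helper names used only for illustration.

```python
# Toy 2-D points forming two obvious groups (made-up values, not the mall data).
points = [(1.0, 1.0), (1.0, 2.0), (8.0, 8.0), (9.0, 8.0)]


def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def wcss(points, centroids):
    """Sum over all points of the squared distance to the nearest centroid."""
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# One centroid at the overall mean versus one centroid per group.
one_cluster = [(4.75, 4.75)]
two_clusters = [(1.0, 1.5), (8.5, 8.0)]

wcss_k1 = wcss(points, one_cluster)
wcss_k2 = wcss(points, two_clusters)
print(wcss_k1, wcss_k2)  # prints: 99.5 1.0
```

Going from one centroid to two drops the WCSS sharply here because the toy data really has two groups. Adding more centroids after that can only shave off a little, and that flattening is the "elbow" one looks for in the plot.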

3. Train the K-Means algorithm on the training dataset. Use the same code as in the previous section, but pass 5 instead of i, because there are 5 clusters that need to be formed, and call fit_predict instead of fit so that the cluster label of each data point is returned. Here’s the code:
# training the K-Means model on the dataset
kmeans = KMeans(n_clusters=5, init='k-means++', random_state=42)
y_predict = kmeans.fit_predict(x)  # cluster label (0 to 4) for each row of x
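
Conceptually, the labels returned by `fit_predict` say which centroid each point ended up closest to. The sketch below shows only that assignment step, in plain Python. The points and centroids are made-up values, and `nearest_centroid` is a hypothetical helper, not part of scikit-learn.

```python
def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    distances = [sum((a - b) ** 2 for a, b in zip(point, c)) for c in centroids]
    return distances.index(min(distances))


# Made-up (income, score) pairs and made-up centroids, for illustration only.
points = [(15, 39), (16, 81), (78, 17), (77, 88), (54, 50)]
centroids = [(55, 50), (25, 20), (26, 79), (88, 17), (87, 82)]

# One label per point, playing the role of y_predict.
labels = [nearest_centroid(p, centroids) for p in points]
print(labels)  # prints: [1, 2, 3, 4, 0]

# Selecting the members of one cluster, like x[y_predict == 0] with NumPy.
cluster_0 = [p for p, label in zip(points, labels) if label == 0]
print(cluster_0)  # prints: [(54, 50)]
```

The full algorithm alternates this assignment step with recomputing each centroid as the mean of its assigned points until the assignments stop changing.
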
4. Visualize the Clusters. Since this model has five
clusters, we need to visualize each one.

# visualizing the clusters
mtp.scatter(x[y_predict == 0, 0], x[y_predict == 0, 1],
            s=100, c='blue', label='Cluster 1')     # first cluster
mtp.scatter(x[y_predict == 1, 0], x[y_predict == 1, 1],
            s=100, c='green', label='Cluster 2')    # second cluster
mtp.scatter(x[y_predict == 2, 0], x[y_predict == 2, 1],
            s=100, c='red', label='Cluster 3')      # third cluster
mtp.scatter(x[y_predict == 3, 0], x[y_predict == 3, 1],
            s=100, c='cyan', label='Cluster 4')     # fourth cluster
mtp.scatter(x[y_predict == 4, 0], x[y_predict == 4, 1],
            s=100, c='magenta', label='Cluster 5')  # fifth cluster
# cluster centers found by K-Means
mtp.scatter(kmeans.cluster_centers_[:, 0], kmeans.cluster_centers_[:, 1],
            s=300, c='yellow', label='Centroid')
mtp.title('Clusters of customers')
mtp.xlabel('Annual Income (k$)')
mtp.ylabel('Spending Score (1-100)')
mtp.legend()
mtp.show()

…………………………………….
PRACTICAL NO.4
Build a Movie Recommendation System in
Python using Machine Learning

How to Build a Movie Recommendation System Using Machine Learning

Building the movie recommendation engine consists of the following steps:
• Perform Exploratory Data Analysis (EDA) on the data
• Build the recommendation system
• Get recommendations

Step 1: Perform Exploratory Data Analysis (EDA) on the data
The dataset contains two CSV files, credits and movies. The credits file contains the metadata about each movie, and the movies file contains information such as the name and id of the movie, its budget, and the languages of the released movie. Let’s load the movie dataset using pandas.
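
As a minimal sketch of this loading step, the example below reads two small in-memory tables with `pd.read_csv`, joins them, and builds a text column named `soup`, which is the column the code that follows vectorizes. Everything specific here is an assumption made for illustration: the column names (`id`, `title`, `genres`, `cast`), the sample rows, the join key, and the way `soup` is assembled. The practical's actual files and columns may differ.

```python
import io

import pandas as pd

# Tiny in-memory stand-ins for the two CSV files (hypothetical columns and rows).
movies_csv = io.StringIO(
    "id,title,genres\n"
    "1,Movie A,action thriller\n"
    "2,Movie B,romance drama\n"
)
credits_csv = io.StringIO(
    "id,cast\n"
    "1,actor_one actor_two\n"
    "2,actor_three\n"
)

movies_df = pd.read_csv(movies_csv)
credits_df = pd.read_csv(credits_csv)

# Join the two tables on the shared movie id (assumed key name).
movies_df = movies_df.merge(credits_df, on="id")

# Build one text field per movie from the descriptive columns (assumed recipe).
movies_df["soup"] = movies_df["genres"] + " " + movies_df["cast"]
print(movies_df[["title", "soup"]])
```

With real files, the `io.StringIO` objects would be replaced by the paths of the movies and credits CSV files.
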
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# movies_df is assumed to be loaded already, with "title" and "soup" columns

# turn each movie's "soup" text into a vector of word counts
count_vectorizer = CountVectorizer(stop_words="english")
count_matrix = count_vectorizer.fit_transform(movies_df["soup"])
print(count_matrix.shape)

# pairwise cosine similarity between all movies
cosine_sim2 = cosine_similarity(count_matrix, count_matrix)
print(cosine_sim2.shape)

# map each title to its row position in movies_df
movies_df = movies_df.reset_index()
indices = pd.Series(movies_df.index, index=movies_df['title'])

def get_recommendations(title, cosine_sim=cosine_sim2):
    idx = indices[title]
    # list of (a, b) pairs where a is the row index of a movie, b its similarity score
    similarity_scores = list(enumerate(cosine_sim[idx]))
    similarity_scores = sorted(similarity_scores, key=lambda x: x[1], reverse=True)
    # skip the first entry (normally the movie itself) and keep the next ten
    similarity_scores = similarity_scores[1:11]
    movies_indices = [ind[0] for ind in similarity_scores]
    movies = movies_df["title"].iloc[movies_indices]
    return movies

print("################ Content Based System #############")
print("Recommendations for The Dark Knight Rises")
print(get_recommendations("The Dark Knight Rises", cosine_sim2))
print()
print("Recommendations for The Avengers")
print(get_recommendations("The Avengers", cosine_sim2))
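
To show what the word-count vectors and the cosine similarity compute, here is a small standard-library sketch on three invented movies. The titles, the `soup` strings, and the helper names `cosine` and `recommend` are all hypothetical. It is also simpler than `CountVectorizer`: it only splits on whitespace and removes no stop words.

```python
import math
from collections import Counter


def cosine(counts_a, counts_b):
    """Cosine similarity between two bag-of-words count dictionaries."""
    # Counter returns 0 for words missing from counts_b.
    dot = sum(counts_a[word] * counts_b[word] for word in counts_a)
    norm_a = math.sqrt(sum(v * v for v in counts_a.values()))
    norm_b = math.sqrt(sum(v * v for v in counts_b.values()))
    return dot / (norm_a * norm_b)


# Hypothetical titles and "soup" strings, invented for illustration.
soups = {
    "Movie A": "action hero city",
    "Movie B": "action hero space",
    "Movie C": "romance drama city",
}
counts = {title: Counter(text.split()) for title, text in soups.items()}


def recommend(title, top_n=2):
    """Other titles ranked by similarity to `title`, most similar first."""
    scores = [(other, cosine(counts[title], counts[other]))
              for other in counts if other != title]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return [other for other, _ in scores[:top_n]]


print(recommend("Movie A"))  # prints: ['Movie B', 'Movie C']
```

"Movie A" shares two of its three words with "Movie B" (similarity 2/3) and only one with "Movie C" (similarity 1/3), so "Movie B" ranks first.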

………………………
PRACTICAL NO.5
Predicting When an Employee Will Leave Your Company
…………………………………..
