Course Plan 21CSC307P - Machine Learning For Data Analytics


SRM INSTITUTE OF SCIENCE AND TECHNOLOGY

FACULTY OF ENGINEERING AND TECHNOLOGY


SCHOOL OF COMPUTING

COURSE PLAN
21CSC307P MACHINE LEARNING FOR DATA ANALYTICS
JUL – NOV 2024

Revision History:
Date Version Modification done Modified by Reviewed by Authorized by
10-07-2024 1.0 Initial Release Dr. G.Premalatha Dr.C.Lakshmi
SRM Institute of Science and Technology, Kattankulathur

Table of Contents

1.0 General Details
2.0 Reference Books
3.0 Prerequisites
4.0 Instructional Objectives
5.0 Overall Assessment Plan
6.0 Tentative Test Schedule
7.0 Detailed Test Plan
8.0 Mini Project Details
9.0 Online Global Certification Course / Real-Time Projects
10.0 Detailed Session Plan
11.0 Overall Execution Plan


1.0 General Details

Course Code: 21CSC307P


Course Title: Machine Learning for Data Analytics
Semester: V
Course Time: JUL – NOV 2024
Slot:
Day | Batch 1 Hour | Batch 1 Timing | Batch 2 Hour | Batch 2 Timing
Day order 1 | 3 & 4 | 9:45-11:30 | 8 & 9 | 2:20-4:00
Day order 2 | - | - | - | -
Day order 3 | - | - | - | -
Day order 4 | - | - | - | -
Day order 5 | 4 | 10:45-11:30 | 8 | 2:20-3:10

Location: University Building

Subject Handling Faculty details:

Faculty Name | Section | Batch | Venue | Mobile number
Dr.A.Hemavathi | AD-1 | Batch-I | UB-507 | 8056591435
Dr.A.Shanthini | AG-1 | Batch-I | UB-510 | 9487555326
Dr.M.Prakash | AF-1 | Batch-I | UB-509 | 9894664948
Dr.A.V.Kalpana | AE-1 | Batch-I | UB-508 | 9842164524
Dr.M.Ramprasath | AF-2 | Batch-II | UB-509 | 9994041003
Dr.G.Premalatha | AD-2 | Batch-II | UB-507 | 9952042616
Dr.Ragupathi | AE-2 | Batch-II | UB-508 | 9600206930

Course Code: 21CSC307P
Course Name: Machine Learning for Data Analytics
Course Category: C (Professional Core)
L T P C: 2 1 0 3

Pre-requisite Courses: Nil
Co-requisite Courses: Nil
Progressive Courses: Nil
Course Offering Department: Data Science and Business Systems
Data Book / Codes / Standards: Nil

Course Learning Rationale (CLR): The purpose of learning this course is to:
CLR-1: Understanding Human learning aspects.
CLR-2: Acquaintance with primitives in the learning process by computer.
CLR-3: Develop the linear learning models and classification in machine learning.
CLR-4: Implement the clustering techniques and their utilization in machine learning.
CLR-5: Implement the tree based machine learning techniques and to appreciate their capability.

Program Learning Outcomes (PO), columns 1 to 12: Engineering Knowledge; Problem Analysis; Design & Development; Analysis, Design, Research; Modern Tool Usage; Society & Culture; Environment & Sustainability; Ethics; Individual & Team Work; Communication; Project Mgt. & Finance; Life Long Learning.
Program Specific Outcomes: PSO-1, PSO-2, PSO-3.

Course Learning Outcomes (CO): At the end of this course, learners will be able to:
CO-1: Demonstrate knowledge of learning algorithms and concept learning through implementation for sustainable solutions of applications.
CO-2: Evaluation of different algorithms on well formulated problems along with stating valid conclusions that the evaluation supports.
CO-3: Formulate a given problem within the Bayesian learning framework with focus on building lifelong learning ability.
CO-4: Analyze research-based problems using machine learning techniques and apply different clustering algorithms used in machine learning to generic datasets and specific multidisciplinary domains.
CO-5: Evaluate decision tree learning algorithms.

CO | PO 1 to 12 | PSO-1 | PSO-2 | PSO-3
CO-1 | - | 1 | - | -
CO-2 | - | - | 2 | -
CO-3 | - | - | - | 2
CO-4 | - | 2 | - | -
CO-5 | - | - | - | 1

Unit-1: INTRODUCTION AND TYPES OF LEARNING 9 hours


Introduction: Machine Learning: What & Why? - Examples of Machine Learning applications, Training versus Testing, Positive and Negative Class, Cross-validation. Types of Learning:
Supervised, Unsupervised and Semi-Supervised Learning. The Curse of dimensionality-Overfitting and underfitting-Linear regression-Bias and Variance tradeoff-Regularization-Learning
Curve-Classification-Error and noise-Parametric vs. non-parametric models-Linear Algebra for machine learning

T1: Building programs to work with data pre-processing in Python
T2: Building programs to work with linear regression in Python
T3: Building programs to work with cross validation in Python
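
As an illustration of what tutorial T3 (cross validation) might involve, the following is a minimal sketch of how K-fold cross-validation partitions a dataset into train and test indices. It is not taken from the course material: the function name `k_fold_indices` is hypothetical, it uses only the Python standard library, and it skips shuffling for clarity. In practice a library routine such as scikit-learn's `KFold` would normally be used instead.

```python
def k_fold_indices(n_samples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold.

    Hypothetical helper for illustration only; no shuffling is performed.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    # Spread the remainder over the first folds so fold sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds = []
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                 # held-out fold
        train = indices[:start] + indices[start + size:]   # all remaining samples
        folds.append((train, test))
        start += size
    return folds


folds = k_fold_indices(10, 3)
for fold_number, (train, test) in enumerate(folds, start=1):
    print(f"fold {fold_number}: train={train} test={test}")
```

Each sample appears in exactly one test fold, so every observation is used for evaluation once and for training in the other folds.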

Unit-2: DESIGN AND ANALYSIS OF MACHINE LEARNING ALGORITHMS 9 hours


Guidelines for machine learning experiments, Cross Validation (CV) and resampling – K-fold CV, bootstrapping, measuring classifier performance, assessing a single classification
algorithm and comparing two classification algorithms – t test, McNemar’s test, K-fold CV paired t test, Performance metrics-MSE, accuracy, confusion matrix, precision, recall,
F1-score-Linear Regression with multiple variables-Logistic Regression-spam filtering with logistic regression

T4: Building programs to compute performance metrics in Python
T5: Building programs to work with linear regression with multiple variables in Python
T6: Building programs to work with logistic regression in Python
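
To make the Unit 2 performance metrics concrete, here is a minimal sketch that derives the confusion-matrix counts, accuracy, precision, recall and F1-score for a binary classifier. It is an illustrative example rather than course material: the function name `binary_metrics` and the sample labels are made up, and only the Python standard library is used.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute confusion-matrix counts and derived metrics for binary labels.

    Hypothetical helper for illustration; `positive` marks the positive class.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)) if (precision + recall) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Made-up labels, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
m = binary_metrics(y_true, y_pred)
print(m)
```

With these sample labels there are 3 true positives, 1 false positive, 1 false negative and 3 true negatives, so accuracy, precision, recall and F1 all come out to 0.75.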
Unit-3: DISTANCE BASED MODELS 9 hours
Ridge Regression-Maximum likelihood estimation (least squares)-Principal component analysis-K nearest neighbour classification-Gaussian Naïve Bayes classification-Multinomial
Naïve Bayes classification-Bernoulli Naïve Bayes classification-Comparison of Gaussian, Multinomial, Bernoulli Naïve Bayes classification-Support vector machine-Support vector
machine + kernels-Multi class classification-Application: face recognition with PCA.

T7: Building Python programs to use principal component analysis
T8: Building Python programs to use Naïve Bayes classification
T9: Building programs to use Support Vector Machine
Unit-4: CLUSTERING TECHNIQUES 9 hours
Measuring (dis)similarity-Evaluating output of clustering methods-Spectral clustering-Hierarchical clustering-Agglomerative clustering-Divisive clustering-Choosing the number of
clusters-Clustering data points and features-Bi-clustering-Multi-view clustering-K-Means clustering-K-medoids clustering-Application: image segmentation using K-means clustering

T10: Building programs to implement Hierarchical clustering


T11: Building programs to implement K-Means clustering
T12: Building programs to perform cluster evaluation

Unit-5: TREE BASED MODELS 9 hours


Decision tree representation-Basic decision tree learning algorithm-Inductive bias in decision tree-Decision tree construction-Issues in decision tree-Classification and regression trees
(CART)-Random Forest-Random Forest with scikit-learn-Minority Class, Impurity Measures – Gini Index and Entropy, BestSplit-Multivariate adaptive regression trees (MART)-
Introduction to Artificial Neural Networks-Perceptron learning

T13: Building programs to implement decision tree algorithm
T14: Building programs to implement random forest algorithm
T15: Building programs to implement Artificial Neural Networks
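
The impurity measures listed in Unit 5 (Gini index and entropy) can be sketched in a few lines. The example below is illustrative only and not taken from the course material: the function names `gini`, `entropy` and `weighted_impurity` are hypothetical, and the weighted impurity of a candidate split is shown as one common way to compare splits when choosing the best one.

```python
from collections import Counter
from math import log2


def gini(labels):
    """Gini index: 1 - sum of squared class proportions."""
    if not labels:
        raise ValueError("labels must not be empty")
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def entropy(labels):
    """Shannon entropy in bits: -sum of p * log2(p) over the classes present."""
    if not labels:
        raise ValueError("labels must not be empty")
    n = len(labels)
    return -sum((count / n) * log2(count / n) for count in Counter(labels).values())


def weighted_impurity(left, right, measure):
    """Size-weighted impurity of a two-way split; lower values indicate a purer split."""
    n = len(left) + len(right)
    return len(left) / n * measure(left) + len(right) / n * measure(right)


# Made-up labels, for illustration only.
parent = ["spam", "spam", "ham", "ham"]
print(gini(parent), entropy(parent))
print(weighted_impurity(["spam", "spam"], ["ham", "ham"], gini))
```

A node with an even two-class mix has the maximum impurity for two classes (Gini 0.5, entropy 1 bit), while a split that separates the classes perfectly has weighted impurity 0.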

Learning Resources
1. Ethem Alpaydin, “Introduction to Machine Learning”, MIT Press, Fourth Edition, 2020.
2. Stephen Marsland, “Machine Learning: An Algorithmic Perspective”, Second Edition, CRC Press, 2014.
3. Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
4. Tom Mitchell, “Machine Learning”, McGraw-Hill, 1997.
5. Sebastian Raschka, Vahid Mirjalili, “Python Machine Learning and deep learning”, 2nd edition, Kindle book, 2018.
6. Carol Quadros, “Machine Learning with python, scikit-learn and Tensorflow”, Packt Publishing, 2018.
7. Gavin Hackeling, “Machine Learning with scikit-learn”, Packt Publishing, O’Reilly, 2018.

Learning Assessment
Continuous Learning Assessment (CLA), by the Course Faculty:
CLA-1: Average of Unit Test (20%)
CLA-2: Project Based Learning (60%)
Report and Viva Voce (20% weightage)
By the CoE: Final Examination (0% weightage)

Bloom's Level of Thinking | CLA-1 Theory | CLA-1 Practice | CLA-2 Theory | CLA-2 Practice | Report and Viva Voce Theory | Report and Viva Voce Practice | Final Examination
Level 1 Remember | 15% | - | - | 15% | 15% | - | -
Level 2 Understand | 25% | - | - | 20% | 25% | - | -
Level 3 Apply | 30% | - | - | 25% | 30% | - | -
Level 4 Analyze | 30% | - | - | 25% | 30% | - | -
Level 5 Evaluate | - | - | - | 10% | - | - | -
Level 6 Create | - | - | - | 5% | - | - | -
Total: CLA-1 100%, CLA-2 100%, Report and Viva Voce 100%, Final Examination -

Course Designers
Experts from Industry:
1. Mr. E Nagarajan, R&D Head, Solvedge Technology
Experts from Higher Technical Institutions:
1. Dr. Anandhakumar P, Professor, Madras Institute of Technology, Chrompet
Internal Experts:
1. Dr.M.Lakshmi, Prof., DSBS, SRMIST
2. Dr.Shobanadevi, DSBS

2.0 Reference Books

1. Ethem Alpaydin, “Introduction to Machine Learning”, MIT Press, Fourth Edition, 2020.
2. Stephen Marsland, “Machine Learning: An Algorithmic Perspective”, Second Edition, CRC Press, 2014.
3. Kevin P. Murphy, “Machine Learning: A Probabilistic Perspective”, MIT Press, 2012.
4. Tom Mitchell, “Machine Learning”, McGraw-Hill, 1997.
5. Sebastian Raschka, Vahid Mirjalili, “Python Machine Learning and deep learning”, 2nd edition, Kindle book, 2018.
6. Carol Quadros, “Machine Learning with python, scikit-learn and Tensorflow”, Packt Publishing, 2018.
7. Gavin Hackeling, “Machine Learning with scikit-learn”, Packt Publishing, O’Reilly, 2018.

3.0 Prerequisites
NIL

4.0 Instructional Objectives


1. Understanding Human learning aspects.

2. Acquaintance with primitives in the learning process by computer

3. Develop the linear learning models and classification in machine learning

4. Implement the clustering techniques and their utilization in machine learning

5. Implement the tree-based machine learning techniques and to appreciate their capability

5.0 Overall Assessment Plan

# | Component | Portion to be Covered | Topics to be Assessed | Mode of Assessment | Mark
1 | 1A | Unit I | Types of Learning, Linear algebra | Written Test | 4
1 | 1B | Unit I | Problem understanding, data preprocessing, applying linear regression and cross validation using Python | Project based assessment | 10
1 | 1C | Unit I | Preprocessing the data, applying linear regression and cross validation using Python in real-time projects | Online Global Certification Course / Real-time Project | 2
2 | 2A | Unit II | Cross Validation (CV) and resampling, problems on MSE, accuracy, confusion matrix, precision, recall, F1-score | Written Test | 4
2 | 2B | Unit II | Analyze performance metrics, apply linear regression with multiple variables & logistic regression | Project based assessment | 10
2 | 2C | Unit II | Analyze performance metrics, apply linear regression with multiple variables & logistic regression in real-time projects | Online Global Certification Course / Real-time Project | 2
3 | 3A | Unit III | Problems involving the derivation of maximum likelihood estimators for different models, Multinomial Naïve Bayes classification | Written Test | 4
3 | 3B | Unit III | Implement Naïve Bayes & SVM classification, use PCA | Project based assessment | 10
3 | 3C | Unit III | Apply PCA, implement Naïve Bayes & SVM classification | Online Global Certification Course / Real-time Project | 2
4 | 4A | Unit IV | Measuring (dis)similarity, evaluating output of clustering methods, agglomerative clustering | Written Test | 4
4 | 4B | Unit IV | Implement hierarchical & K-means clustering, & evaluate its performance | Project based assessment | 10
4 | 4C | Unit IV | Apply hierarchical & K-means clustering, & evaluate its performance | Online Global Certification Course / Real-time Project | 2
5 | 5A | Unit V | Decision Tree Representation, CART, MART, Introduction to Artificial Neural Networks | Written Test | 4
5 | 5B | Unit V | Apply Random Forest, Decision tree algorithm, & build ANN model | Project based assessment | 10
5 | 5C | Unit V | Apply Random Forest, Decision tree algorithm, & ANN model in real-time project | Online Global Certification Course / Real-time Project | 2
6 | 6A | Final project review | Final project | Final Presentation with front-end, Report Submission, Demo | 10
6 | 6B | Final project review | Online Global Certification Course / Real-time Project | Final Presentation with front-end, Report Submission, Demo | 10
Total Marks: 100

6.0 Tentative Test Schedule

# | Tentative date | Test | Marks | Portion | Duration
1 | 30.7.24 | Written Test | 4 | Unit 1 | 50 minutes
2 | 13.8.24 | Assignment - problem solving | 4 | Unit 2 | 50 minutes
3 | 5.9.24 | Written Test | 4 | Unit 3 | 50 minutes
4 | 27.9.24 | Chart preparation | 4 | Unit 4 | 50 minutes
5 | 5.10.24 | Written Test | 4 | Unit 5 | 50 minutes


7.0 Detailed Test Plan

Test Component | Type of Test | Tentative Date | Internal Mark | Question Pattern | Mode
1A | Written Test | 30.7.24 | 4 | Exam Pattern: Concept Understanding Questions - 5 * 2 = 10; Scenario based / HOTs Questions - 3 * 5 = 15 | Physical Exam
2A | Assignment - problem solving | 13.8.24 | 4 | Total: 10 Marks. Questions - 2 * 5 = 10 | Physical Exam
3A | Written Test | 5.9.24 | 4 | Total: 25 Marks. Exam Pattern: Problems - 10 * 2 = 20; MCQ - 1 * 5 = 5 | Physical Exam
4A | Chart preparation | 27.9.24 | 4 | Total: 10 Marks. Activity Pattern: Team size - 5; each student should present | Physical Exam
5A | Written Test | 5.10.24 | 4 | Total: 25 Marks. Exam Pattern: Concept Understanding Questions - 5 * 2 = 10; Scenario based / HOTs Questions - 3 * 5 = 15 | Physical Exam

8.0 Mini Project details:

Test Component | Artifacts | Tentative date of final evaluation | Total Marks | Mark Split-up
1B | Problem understanding, data preprocessing, applying linear regression and cross validation using Python | 24.7.24 & 31.7.24 | 10 | Objective - 1 Mark; Dataset identification & preprocessing - 3 Marks; Presentation - 3 Marks; Viva - 3 Marks
2B | Analyze performance metrics, apply linear regression with multiple variables & logistic regression | 7.8.24 & 14.7.24 | 10 | Problem Identification - 1 Mark; Presentation - 3 Marks; Demo - 3 Marks; Viva - 3 Marks
3B | Implement Naïve Bayes & SVM classification, use PCA | 22.7.24 & 30.7.24 | 10 | Problem Identification - 1 Mark; Presentation - 3 Marks; Demo - 3 Marks; Viva - 3 Marks
4B | Implement hierarchical & K-means clustering, & evaluate its performance | 8.10.24 | 10 | Problem Identification - 1 Mark; Presentation - 3 Marks; Demo - 3 Marks; Viva - 3 Marks
5B | Apply Random Forest, Decision tree algorithm, & ANN | 16.10.24 | 10 | Problem Identification - 1 Mark; Presentation - 3 Marks; Demo - 3 Marks; Viva - 3 Marks
6A | Final Review | 29.10.24 | 10 | Final Presentation - 2 Marks; Demo - 2 Marks; Viva - 3 Marks; Report - 3 Marks

9.0 Online Global Certification Course / Real-Time Project Plan


Test Component | Marks | Tentative Date | Split-up
1C | 2 | 24.7.24 | Registering for Online Global Certification / Letter from the employer for Real-time Projects
2C | 2 | 7.8.24 | 20% Assignment submission / online course completion of Online Global Certification / 20% Real-time project completion
3C | 2 | 22.7.24 | 40% Completion of Online Global Certification / 40% Real-time project completion
4C | 2 | 8.10.24 | 60% Completion of Online Global Certification / 60% Real-time project completion certificate
5C | 2 | 23.10.24 | 80% Completion of Online Global Certification / 80% Real-time project completion certificate
6B | 10 | 29.10.24 | Completion of Online Global Certification / Real-time project completion certificate from the employer

10.0 Detailed Session Plan


# | Topics to be covered | Hours | Ref | Teaching method | Testing method

Unit 1
1 | Introduction: Machine Learning: What & Why? - Examples of Machine Learning applications | 1 | | PPT | Illustration using example
2 | Training versus Testing, Positive and Negative Class, Cross-validation | 1 | | PPT | Illustration using example / Think-pair-share activity
3 | Types of Learning: Supervised | 1 | | PPT | Illustration, Demonstration using examples
4 | Unsupervised and Semi-Supervised Learning | 1 | | PPT | Illustration, Demonstration using examples
5 | The Curse of dimensionality-Overfitting and underfitting; T1: Building programs to work with data pre-processing in Python | 1 | | PPT | Illustration, Demonstration using examples
6 | Linear regression-Bias and Variance tradeoff-Regularization; T2: Building programs to work with linear regression in Python | 1 | | PPT | Illustration, Demonstration using examples
7 | Learning Curve-Classification-Error and noise | 1 | | PPT | Illustration, Demonstration using examples
8 | Parametric vs. non-parametric models | 1 | | PPT | Illustration, Demonstration using examples
9 | Linear Algebra for machine learning; T3: Building programs to work with cross validation in Python | 1 | | PPT | Illustration, Demonstration using examples

Unit 2
10 | Guidelines for machine learning experiments, Cross Validation (CV) and resampling | 1 | | PPT | Illustration, Demonstration using examples
11 | K-fold CV, bootstrapping, measuring classifier performance | 1 | | PPT | Illustration, Demonstration using examples
12 | Assessing a single classification algorithm and comparing two classification algorithms - t test, McNemar’s test | 1 | | PPT | Illustration, Demonstration using examples
13 | K-fold CV paired t test; Performance metrics-MSE, accuracy | 1 | | PPT | Illustration, Demonstration using examples
14 | Confusion matrix, precision, recall, F1-score; T4: Building programs to compute performance metrics in Python | 1 | | PPT | Illustration, Demonstration using examples
15 | Linear Regression with multiple variables; T5: Building programs to work with linear regression with multiple variables in Python | 1 | | PPT | Illustration, Demonstration using examples
16 | Logistic Regression-spam filtering with logistic regression; T6: Building programs to work with logistic regression in Python | 1 | | PPT | Illustration, Demonstration using examples

Unit 3
17 | Ridge Regression-Maximum likelihood estimation (least squares) | 1 | | PPT | Demonstration using examples
18 | Principal component analysis-K nearest neighbour classification | 1 | | PPT | Demonstration using examples
19 | T7: Building Python programs to use principal component analysis | | | PPT |
20 | Gaussian Naïve Bayes classification-Multinomial Naïve Bayes classification | 1 | | PPT | Illustration, Demonstration using examples
21 | T8: Building Python programs to use Naïve Bayes classification | | | PPT |
22 | Bernoulli Naïve Bayes classification | 1 | | PPT | Illustration, Demonstration using examples
23 | Comparison of Gaussian, Multinomial, Bernoulli Naïve Bayes classification | 1 | | PPT | Illustration, Demonstration using examples
24 | Support vector machine-Support vector machine + kernels | 1 | | PPT | Illustration, Demonstration using examples
25 | Multi class classification | 1 | | PPT | Illustration, Demonstration using examples
26 | T9: Building programs to use Support Vector Machine | | | PPT |
27 | Application: face recognition with PCA | 1 | | PPT | Illustration, Demonstration using examples

Unit 4
28 | Measuring (dis)similarity-Evaluating output of clustering methods-Spectral clustering | 1 | | PPT | Group Discussion, Illustration by examples
29 | Hierarchical clustering; T10: Building programs to implement Hierarchical clustering | 1 | | PPT | Group Discussion, Illustration by examples
30 | Agglomerative clustering-Divisive clustering | 1 | | PPT | Illustration, Demonstration using examples
31 | Choosing the number of clusters-Clustering data points and features | 1 | | PPT | Illustration, Demonstration using examples
32 | Bi-clustering-Multi-view clustering | 1 | | PPT | Illustration, Demonstration using examples
33 | K-Means clustering; T11: Building programs to implement K-Means clustering | 1 | | PPT | Illustration, Demonstration using examples
34 | K-medoids clustering | 1 | | PPT | Illustration, Demonstration using examples
35 | Application: image segmentation using K-means clustering | 1 | | PPT | Illustration, Demonstration using examples
36 | T12: Building programs to perform cluster evaluation | 1 | | PPT |

Unit 5
37 | Decision tree representation-Basic decision tree learning algorithm | 1 | | PPT | Illustration, Demonstration using examples
38 | Inductive bias in decision tree-Decision tree construction-Issues in decision tree; T13: Building programs to implement decision tree algorithm | 1 | | PPT | Illustration, Demonstration using examples
39 | Classification and regression trees (CART) | 1 | | PPT | Group Discussion, Illustration by examples
40 | Random Forest-Random Forest with scikit-learn-Minority Class | 1 | | PPT | Group Discussion, Illustration by examples, Poster presentation
41 | T14: Building programs to implement random forest algorithm | 1 | | PPT |
42 | Impurity Measures - Gini Index and Entropy, BestSplit | 1 | | PPT | Illustration, Demonstration using examples
43 | Multivariate adaptive regression trees (MART) | 1 | | PPT | Illustration, Demonstration using examples
44 | Introduction to Artificial Neural Networks-Perceptron learning | 1 | | PPT | Illustration, Demonstration using examples
45 | T15: Building programs to implement Artificial Neural Networks | 1 | | PPT |

11.0 Overall Execution Plan

Activity 1: Video Content Preparation
Target date: 22-07-2024
Assigned to: All faculty
Responsibilities:
Select the list of topics unit-wise to prepare concepts.
Send the list of planned topics to the audit professors for review.
Guidelines for video preparation:
1. Each video should cover a separate topic.
2. The duration of each video should be 7 to 10 minutes only.
3. A common template is to be used.
4. Formal dress code while recording.
5. Each video should cover: introduction to the topic, overview, illustration of the concepts, demo (if any), applications, conclusion.
6. Approval from the Audit Professor.

Activity 2: Question Bank Preparation
Target date: 26-07-2024
Assigned to: All faculty
Responsibilities:
1. The faculty has to prepare questions for all the 5 units.
2. Questions must be framed by the faculty member and not taken as such from any other source. Other sources can be referred to, but the question has to be modified, for example with a different example program.
3. A solution is required for all questions. Each unit should contain the following:
• Multiple Choice Questions - 10
• Concept Understanding Questions - 5
• Scenario based / HOTs Questions - 3

Activity 3: Question Bank Scrutiny
Target dates: 24-07-2024, 06-08-24, 23-08-24, 17-09-24, 30-09-24
Assigned to: Course Coordinator
Responsibilities:
1. Check the quality of the questions as per the category in the question bank.
2. Ensure there are no repetitions.
3. Approval from the Audit Professor.

Activity 4: Cycle Test
Target dates: 30.07.24, 13.08.24, 05.09.24, 27.09.24, 05.10.24
Assigned to: Course Coordinator
Responsibilities:
1. Select the questions from the Question Bank.
2. Share the question paper (QP) with the audit professor for review.
3. Plan the printing of the cycle test question papers, then print and distribute them.
4. Approval from the Audit Professor.

Activity 6: Course File Preparation
Target dates: 30-08-2024, 03-10-2024, 05-11-2024, 02-12-2024
Assigned to: Course Coordinator
Responsibilities:
1. Responsible for the preparation of the course file as per the checklist.
2. At the end of each CT exam, the files should be updated and verified by the Team Head.
3. Participate in the result analysis activity.
4. Course files are to be prepared for the department, and the faculty is responsible for their preparation, including CO-PO mapping, attainment of COs, etc.
5. Approval from the Audit Professor.

Activity 7: Feedback Collection and Minutes of Meeting
Target dates: 12-07-2024, 08-08-2024, 13-09-2024, 03-10-2024, 12-11-2024
Assigned to: Course Coordinator
Responsibilities:
1. Scribe and prepare the minutes of meeting (MoM) for all meetings conducted.
2. Share the MoM with the Audit Professor on the same day or the day after the meeting.
3. Approval from the Audit Professor.
