Cz4041 1a Introduction

CZ4041/CE4041:
Machine Learning
Lecture 1a: Introduction
Sinno Jialin PAN
School of Computer Science and Engineering
NTU, Singapore
Homepage: http://www.ntu.edu.sg/home/sinnopan
General Information
 Instructor (solo):
 Dr. Sinno Jialin PAN
 Lecture time/venue
 Tuesdays 12:30 – 2:30pm @ LT7
 Tutorial time/venue
 Thursdays 8:30 – 9:30am @ LT6 (starting from 2nd week)
General Information (cont.)
Lectures
Weeks: Week 1 – Week 12
Topics: Overview, Supervised Learning, Ensemble Learning, Unsupervised Learning, Dimensionality Reduction
Note: No classes in Week 4 (Chinese New Year)
Tutorials
Weeks: Week 2 – Week 12
Note: No tutorial in Week 3 (Overseas Conference)
General Information (cont.)
 Q&A
 Office Hours @ N4, #02b-44:
 Tuesdays 2:30 – 3:00pm
 Make an appointment via email [email protected]
 Send me questions via email
 Course Webpage
 In NTULearn
Evaluation
 Course project (40%)
• Group-based
• Course report (30%) + presentation video (10%)
 Open book final exam (60%)
 2 hours
Hot Keywords in the IT Sector
Data Science
Machine Learning
What is Machine Learning?
 Motivated by how human beings learn from
examples/experience/exercise
 Focuses on the development of computer programs
that can teach themselves to grow from data and
change when exposed to new data
A Motivating Example
 Given a face image, classify the gender of the face
 In the past, to develop an AI system for such a task,
developers or domain experts needed to provide rules
and implement them in the system:
If the face has long hair and does not have a moustache, then this is a
“female” face;
If the face has short hair and a moustache, then this is a “male” face.
A Motivating Example (cont.)
 Limitations:
 Time consuming
 The defined rules may not be complete
 Not able to handle uncertainty
If the face has long hair and does not have a moustache,
then this is a “female” face;
If the face has short hair and a moustache, then this is a
“male” face.
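The two hand-written rules can be sketched in a few lines of Python, which makes the incompleteness limitation visible. This is only an illustration: the function name `classify_by_rules` and the boolean inputs `long_hair` and `moustache` are assumptions, not part of the lecture.

```python
def classify_by_rules(long_hair, moustache):
    """Apply the two hand-written rules; return None when no rule fires."""
    if long_hair and not moustache:
        return "female"
    if not long_hair and moustache:
        return "male"
    # The rules say nothing about the remaining combinations.
    return None


# Every combination of the two (assumed) boolean features.
for long_hair in (True, False):
    for moustache in (True, False):
        print(long_hair, moustache, classify_by_rules(long_hair, moustache))
```

Two of the four combinations (long hair with a moustache, short hair without one) are left undecided, so an expert would have to keep adding rules by hand.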
A Motivating Example (cont.)
 How about letting the machine learn the rules by itself?
 The computer is presented with example inputs and their
desired outputs, given by human, and the goal is to “learn”
a general rule or “model” that maps inputs to outputs
Figure: example inputs and their outputs (Female, Female, Male, Male) are used to learn a model
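As a rough sketch of letting the machine find a rule by itself, the code below learns a one-feature rule (a decision stump) from labeled examples. The training examples, the binary feature encoding `(long_hair, moustache)`, and the function name `fit_stump` are all hypothetical; the stump is just one simple learner chosen for illustration, not the method used in the lecture.

```python
from collections import Counter


def fit_stump(examples):
    """Pick the single binary feature whose values best predict the labels."""
    n_features = len(examples[0][0])
    overall = Counter(y for _, y in examples).most_common(1)[0][0]
    best = None
    for j in range(n_features):
        # Count the labels seen for each value (0 or 1) of feature j.
        votes = {0: Counter(), 1: Counter()}
        for x, y in examples:
            votes[x[j]][y] += 1
        # The learned rule maps each feature value to its majority label.
        rule = {v: (c.most_common(1)[0][0] if c else overall)
                for v, c in votes.items()}
        correct = sum(rule[x[j]] == y for x, y in examples)
        if best is None or correct > best[0]:
            best = (correct, j, rule)
    _, j, rule = best
    return lambda x: rule[x[j]]


# Hypothetical examples: (long_hair, moustache) -> label given by a human.
examples = [((1, 0), "female"), ((1, 0), "female"),
            ((0, 1), "male"), ((0, 0), "male")]
model = fit_stump(examples)
print(model((1, 0)), model((0, 1)))
```

Unlike the hand-written rules, the learned model always returns a label, and it changes if the training examples change.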
Machine Learning Definition
 Machine learning is a type of artificial intelligence
that provides computers with the ability to learn
from examples/experience without being explicitly
programmed
Traditional AI: Inputs + Rules → Machine → Outputs
Machine Learning: Inputs + Outputs → Machine → Model (or Rules)
Prediction: Inputs → Model → Outputs
How to Represent an Example?
 Feature engineering (not the focus of machine learning)
 Representation learning (one of the crucial research
topics in machine learning)
 Deep learning is currently the most effective approach to
representation learning
Machine Learning = Deep Learning = AI?
 Machine learning is one field of AI; there are many other fields
 Deep learning is one family of machine learning
methodologies; there are many other methodologies
 Machine learning has become a primary mechanism for
data analytics (key in data science)
 Nowadays, machine learning is more and more
interdisciplinary:
 Distributed/parallel computing + machine learning →
Distributed/parallel machine learning
 Machine learning + hardware → AI chips
Different Paradigms/Settings
 Supervised Learning
 Unsupervised Learning
 Reinforcement Learning
Supervised Learning
 The examples presented to computers are pairs of
inputs and the corresponding outputs; the goal is to
“learn” a mapping or model from inputs to outputs (labels)
Labeled training data
Inputs: Face images
Outputs: Female or Male (e.g., Female, Female, Male, Male)
$f$: $\text{label} = f(\text{input})$
Outputs are discrete values → classification
Outputs are continuous values → regression
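The difference between the two kinds of output can be sketched with two tiny learners in plain Python. Both are illustrative assumptions: a 1-nearest-neighbor classifier for discrete labels and an ordinary least-squares line for continuous labels, each on made-up one-dimensional data.

```python
def predict_1nn(train, x):
    """Classification: return the label of the nearest training input."""
    _, nearest_label = min(train, key=lambda pair: abs(pair[0] - x))
    return nearest_label


def fit_line(train):
    """Regression: ordinary least-squares fit of y = w * x + b."""
    n = len(train)
    mean_x = sum(x for x, _ in train) / n
    mean_y = sum(y for _, y in train) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in train)
    var = sum((x - mean_x) ** 2 for x, _ in train)
    w = cov / var
    b = mean_y - w * mean_x
    return lambda x: w * x + b


# Hypothetical labeled data: discrete labels for classification...
labeled = [(1.0, "negative"), (2.0, "negative"),
           (8.0, "positive"), (9.0, "positive")]
# ...and continuous labels for regression (generated from y = 2x + 1).
measured = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]

line = fit_line(measured)
print(predict_1nn(labeled, 7.0))  # a discrete label
print(line(4.0))                  # a continuous value
```

In both cases the learned function plays the role of $f$ in $\text{label} = f(\text{input})$; only the type of the output differs.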
Supervised Learning – Regression I
Supervised Learning – Regression II
Stock price prediction

Unsupervised Learning
 The examples presented to computers are a set of
inputs without any outputs; the goal is to “learn”
an intrinsic structure of the examples, e.g., clusters
of examples, density of the examples
Unlabeled training data
Inputs: Face images
Groups of similar faces
Unsupervised Learning – Clustering
User Segmentation
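A minimal sketch of clustering, assuming one-dimensional data and the standard k-means procedure (alternate between assigning points to the nearest centroid and moving each centroid to the mean of its points). The data values, the starting centroids, and the function name `kmeans_1d` are hypothetical.

```python
def kmeans_1d(points, centroids, n_iters=10):
    """Plain k-means on 1-D points, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(n_iters):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda k: abs(p - centroids[k]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [sum(c) / len(c) if c else centroids[k]
                         for k, c in enumerate(clusters)]
        if new_centroids == centroids:
            break  # assignments can no longer change
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled inputs, e.g. one numeric attribute per user.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids, clusters = kmeans_1d(points, centroids=[1.0, 12.0])
print(centroids, clusters)
```

No labels are supplied anywhere; the two groups emerge from the inputs alone.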
Reinforcement Learning
 Learning by interacting with an environment to
achieve a goal
 Goal: to learn an optimal policy mapping states to
actions
$S_0 \xrightarrow{a_0,\ r_0} S_1 \xrightarrow{a_1,\ r_1} S_2 \xrightarrow{a_2,\ r_2} S_3$
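The sequence of states, actions, and rewards can be made concrete with tabular Q-learning on a toy problem. Everything here is an assumption for illustration: a four-state chain where reaching the last state pays reward 1, purely random exploration, and the constants `alpha`, `gamma`, and `seed`. It is a much simpler relative of deep RL methods, not an implementation of them.

```python
import random

N_STATES = 4                 # S0..S3; S3 is the goal (terminal) state
ACTIONS = ["left", "right"]


def step(state, action):
    """Hypothetical environment: move along the chain, reward 1 at S3."""
    if action == "right":
        nxt = min(state + 1, N_STATES - 1)
    else:
        nxt = max(state - 1, 0)
    reward = 1.0 if nxt == N_STATES - 1 else 0.0
    return nxt, reward


def q_learning(n_episodes=1000, alpha=0.5, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(n_episodes):
        state = 0
        for _ in range(50):                      # cap on episode length
            a = rng.randrange(len(ACTIONS))      # random exploration
            nxt, reward = step(state, ACTIONS[a])
            # Move Q(s, a) toward reward + discounted best next value.
            target = reward + gamma * max(q[nxt])
            q[state][a] += alpha * (target - q[state][a])
            state = nxt
            if state == N_STATES - 1:
                break
    return q


def greedy_policy(q):
    """Policy: for each non-terminal state, the action with the highest Q."""
    return [ACTIONS[max(range(len(ACTIONS)), key=lambda a: q[s][a])]
            for s in range(N_STATES - 1)]


policy = greedy_policy(q_learning())
print(policy)
```

The resulting list is a policy in the sense used above: one action per state, learned only from interaction and rewards.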
Reinforcement Learning (cont.)
• Deep Q-Network (DQN) [1]
• Play Atari 2600 Games
Figure labels: Agent, Policy, Action, Reward
[1] Mnih et al., Human-level control through deep reinforcement learning. Nature, 2015
Course Content
Supervised Learning: Classification, Regression
Unsupervised Learning: Clustering, Density Estimation
Ensemble Learning
Dimensionality Reduction
Out of scope but interesting topics
Reinforcement Learning: Reinforcement learning: a survey (covered by CZ/CE4046)
Semi-supervised Learning: Semi-supervised learning literature survey
Active Learning: Active learning literature survey
Transfer Learning: A survey on transfer learning
Course Objective
To provide students with essential concepts
and principles of algorithms in machine
learning
To enable students to understand how to use
or design various machine learning
techniques to solve supervised learning and
unsupervised learning problems
Breadth and Depth
 Classification (through lectures):
 Bayesian Decision Theory
 Bayesian Classifiers (Naïve Bayes & Bayesian Networks)
 Decision Trees
 Artificial Neural Networks
 Support Vector Machines and Kernel Machines
(additional notes)
 Nearest-neighbor Classifiers
 Regression (through lectures):
 Linear Regularized Least-Squares Regression and its
Kernelized version (additional notes)
Breadth and Depth (cont.)
 Clustering (through lectures):
 K-means and its variants
 Hierarchical clustering
 Density Estimation (through lectures):
 Parametric methods
 Non-parametric methods
 Real-world Applications or Advanced Research
Topics (through course project)
Breadth and Depth (cont.)
 Focus on introducing well-known concepts and
fundamental methodologies of machine learning
 Motivations
 Induction of the mathematical models (mathematics)
 For those who want to learn more, some up-to-date
techniques and advanced issues will be mentioned
 Details that cannot be covered in lectures will be left to
suggested additional reading (optional)
Relationships to Other Modules
CZ4041/CE4041: Machine Learning
Modern AI approaches:
 Classification:
 Bayesian Decision Theory
 Bayesian Classifiers (Naïve Bayes & Bayesian Networks)
 Decision Trees
 Support Vector Machines & Kernelization
 Nearest-Neighbor Classifier
 Regression:
 Linear Regression & Kernelization
 Clustering
 Density Estimation
CZ3005: Artificial Intelligence
Classic AI approaches:
 Search
 First Order Logic
CZ4042/CE4042: Neural Networks
Various Structures of Neural Networks
Relationships to Other Modules (cont.)
CZ4041/CE4041: Machine Learning  CZ4032/CE4032: Data Analytics and Mining
 Classification:  Data Preprocessing
 Bayesian Decision Theory  Classification:
 Bayesian Classifiers (Naïve Bayes & Bayesian  Decision Trees
Networks)  Rule-based Classifiers
 Decision Trees
 Naïve Bayes
 Nearest-Neighbor Classifiers
 Support Vector Machines & Kernelization
 Nearest-Neighbor Classifier
 Support Vector Machines
 Regression:  Clustering:
 Linear Regression & Kernelization
 Clustering:  Hierarchical clustering
 K-means and its variants  Density-based
 Association Rule Mining
 Density Estimation  Dimensionality Reduction
 Dimensionality Reduction
30 Aim: Deeply understand principles Aim: How to use

Mathematics Background
Various learning paradigms:
supervised learning, unsupervised learning, semi-supervised
learning, active learning, transfer learning,
reinforcement learning, etc.
Various types of methodologies:
graphical models, deep learning, empirical risk
minimization, entropy-based models, kernel methods, etc.
Various mathematical techniques:
probability theory, linear algebra, calculus, optimization,
information theory, functional analysis, etc.
Course Schedule (Tentative)
Week                 Date        Topics
Week 1               15th Jan.   Introduction, Overview of Supervised Learning
Week 2               22nd Jan.   Bayesian Decision Theory, Naïve Bayes Classifier I
Week 3               29th Jan.   Naïve Bayes Classifier II, Bayesian Belief Networks
Week 4               5th Feb.    No Classes – Chinese New Year
Week 5               12th Feb.   Decision Trees & Generalization
Week 6               19th Feb.   Perceptron, Multilayer Neural Networks
Week 7 (e-learning)  26th Feb.   Nearest-Neighbor Classifier
Week 8               12th Mar.   Support Vector Machines, Regularized Regression
Week 9               19th Mar.   Model Evaluation, Ensemble Learning
Week 10              26th Mar.   Clustering: K-means & Hierarchical Clustering
Week 11              2nd Apr.    Parametric & Non-parametric Density Estimation
Week 12              9th Apr.    Dimensionality Reduction
Textbook and Reference
 Textbook:
 Introduction to Machine Learning (2nd Ed.), by Ethem
Alpaydin, The MIT Press, 2010.
 Reference:
 Pattern Classification (2nd Ed.), by Richard Duda, Peter
Hart, and David Stork, Wiley-Interscience, 2000.
 Introduction to Data Mining, by Pang-Ning Tan,
Michael Steinbach, and Vipin Kumar, Addison Wesley,
2005.
 Pattern Recognition and Machine Learning, by
Christopher M. Bishop, Springer, 2006.
Useful Resources: Datasets
 UCI Repository:
 http://www.ics.uci.edu/~mlearn/MLRepository.html
 Kaggle:
 http://www.kaggle.com/
Useful Resources: Libraries
 scikit-learn (Python):
 http://scikit-learn.org/stable/
 MALLET (Java):
 http://mallet.cs.umass.edu/
 Weka (Java):
 http://www.cs.waikato.ac.nz/ml/weka/
 TensorFlow:
 https://www.tensorflow.org/
 PyTorch:
 https://pytorch.org/
 Many other libraries on deep learning
• http://deeplearning.net/software_links/
Useful Resources: Conferences
 International Conference on Machine Learning (ICML)
 Neural Information Processing Systems (NIPS)
 Conference on Learning Theory (COLT)
 Uncertainty in Artificial Intelligence (UAI)
 International Conference on AI & Statistics (AISTATS)
 International Joint Conference on Artificial Intelligence
(IJCAI)
 AAAI Conference on Artificial Intelligence (AAAI)
 International Conference on Learning Representations
(ICLR)
Useful Resources: Journals
 Journal of Machine Learning Research (JMLR)
 Machine Learning (MLJ)
 IEEE Transactions on Pattern Analysis and
Machine Intelligence (TPAMI)
 IEEE Transactions on Neural Networks and
Learning Systems (TNNLS)
 Artificial Intelligence (AIJ)
 Journal of Artificial Intelligence Research (JAIR)
Detailed Project Description
 This is a group-based course project
 Each group consists of 4-5 members
 Each group can either choose one selected Kaggle
competition or choose one selected research topic
as the course project
 A list of selected Kaggle competitions and
research topics will be provided in the tutorial of
Week 2
Programming Languages
 Programming Languages:
 Any programming language can be used, e.g.,
Matlab, Python, C/C++, Java, R, etc.
 Any open-source ML toolbox can be used
 Note: for Kaggle competitions, directly
using the source code released by
participants is not allowed (a 20% penalty
will be applied if found)
General Information of Kaggle
Kaggle.com: training file(s) containing $(\boldsymbol{x}_i, y_i)$'s; test file containing only $\boldsymbol{x}_i$'s
Participants: a predictive model; prediction file containing only $y_i$'s
Leaderboard
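The flow of files between Kaggle and a participant can be sketched end to end with the standard library. The file names, the column names `id`, `x`, `y`, and the toy threshold model are all hypothetical; each real competition defines its own file layout and submission format.

```python
import csv
import os
import tempfile


def write_csv(path, header, rows):
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        writer.writerows(rows)


def read_csv(path):
    with open(path, newline="") as f:
        return list(csv.DictReader(f))


def fit_threshold(train_rows):
    """Toy model: threshold halfway between the two class means of x."""
    means = {}
    for label in ("0", "1"):
        xs = [float(r["x"]) for r in train_rows if r["y"] == label]
        means[label] = sum(xs) / len(xs)
    threshold = (means["0"] + means["1"]) / 2
    high, low = ("1", "0") if means["1"] > means["0"] else ("0", "1")
    return lambda x: high if x > threshold else low


def make_submission(train_path, test_path, out_path):
    model = fit_threshold(read_csv(train_path))        # uses (x_i, y_i)
    rows = [(r["id"], model(float(r["x"])))            # test has x_i only
            for r in read_csv(test_path)]
    write_csv(out_path, ["id", "y"], rows)             # predicted y_i only


# Stand-ins for the files a competition would provide (made-up values).
workdir = tempfile.mkdtemp()
train_path = os.path.join(workdir, "train.csv")
test_path = os.path.join(workdir, "test.csv")
out_path = os.path.join(workdir, "submission.csv")
write_csv(train_path, ["id", "x", "y"],
          [(1, 1.0, 0), (2, 2.0, 0), (3, 8.0, 1), (4, 9.0, 1)])
write_csv(test_path, ["id", "x"], [(5, 1.5), (6, 8.5)])

make_submission(train_path, test_path, out_path)
submission = read_csv(out_path)
print(submission)
```

The prediction file written at the end is what gets uploaded and scored against the hidden test labels to produce the leaderboard position.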
Submission (Kaggle)
 Submitted files:
1. A project report
2. A presentation video
3. The final .csv file of your prediction results submitted
to the specific competition in Kaggle you participate in
4. Your source code (with a readme file)
 Notes:
 The submitted .csv file is used to double-check whether the
reported results are correct
 The submitted source code is used to double-check
whether it was simply copied from code released by
other participants
Submission (Research)
 Submitted files:
1. A project report
2. A presentation video
3. Your source codes (with a readme file)
 Notes:
 The submitted source codes are to double check
whether the reported results are correct
 Report format: 12 point font, single space, 20-25 pages
Format of Report and Video
 Presentation video:
 Introduce your course project in a video of 10-15
minutes
 You can use any tool to produce the video, e.g., simply
using PowerPoint or other advanced tools
 File size ≤ 8M
 Some examples for reference:
https://www.youtube.com/channel/UCSBrGGR7JOiSyzl60OGdKYQ
https://www.youtube.com/channel/UC_sfvZvvPUbOQhDs_cqlx_A
 Report format:
 12 point font, single space, 20-25 pages
Content of Project Report (Kaggle)
 Specific roles of each group member
 An evaluation score and ranked position of your prediction
results for the specific competition in Kaggle
 Problem statement (in your own words instead of
copy-and-paste from Kaggle)
 Challenges of the problem
 Your proposed solution in detail (preprocessing, feature
engineering/representation learning, methodologies, etc.)
 Experiments to demonstrate why your proposed solution
is appropriate for the problem
 Conclusion: what you have learned from the project
Content of Project Report (Research)
 Specific roles of each group member
 A review on the specific research topic
 Your new proposed method if applicable
 Comparison experiments on state-of-the-art methods
(and your proposed method if applicable)
 Analysis on pros and cons of the compared methods
 Conclusion: your own insights on the research project
Assessments on Project Reports
Kaggle competitions:
 Leaderboard performance
 Convincingness
 Solution novelty
 Writing
Research-based projects:
 Literature review
 Comparison analysis
 Methodology novelty
 Writing
Writing: this assessment evaluates whether the organization of the report is
clear and easy to follow, and whether the report contains many typos
Assessments – Kaggle
 Leaderboard Performance: although all the listed
Kaggle competitions are completed, you can still
submit your results to Kaggle to obtain an
evaluation score and a ranking position
 The performance assessment is based on the
relative ranking of your results on the specific
competition (i.e., top 20%, top 40%, top 60%, top
80%, and top 100%)
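One possible reading of the ranking bands, sketched as arithmetic: a result falls in the smallest "top X%" band that contains its rank. The function name `leaderboard_band` and the choice to treat band boundaries as inclusive are assumptions; how bands translate into marks is not stated here.

```python
def leaderboard_band(rank, n_teams):
    """Return the smallest 'top X%' band (20, 40, 60, 80, 100) holding rank.

    Assumes rank 1 is best and that a rank exactly on a boundary
    (e.g. 200 of 1000) still counts as inside the better band.
    """
    if not 1 <= rank <= n_teams:
        raise ValueError("rank must be between 1 and n_teams")
    # Integer ceiling of 5 * rank / n_teams, avoiding float rounding.
    fifths = (5 * rank + n_teams - 1) // n_teams
    return 20 * fifths


# Hypothetical leaderboard with 1000 teams.
for rank in (150, 200, 201, 1000):
    print(rank, "-> top", leaderboard_band(rank, 1000), "%")
```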
Assessments – Kaggle (cont.)
 Solution Novelty: on Kaggle.com, most
participants or winners may discuss their solutions
on the forums of the specific competitions.
 If you propose a new and effective solution, you can
get a bonus. You are encouraged to propose your own
solutions based on your own understanding of the
competitions
Assessments – Kaggle (cont.)
 Convincingness: the goal of the project report is
to convince readers that your proposed solution is
appropriate for the specific machine learning task.
In your report, you need to conduct experiments to
verify your proposed ideas
Assessments – Research
 Literature review: as this is a research project,
figuring out what has been done in the literature
is important. You should provide a comprehensive
review of the specific research topic studied in
your project
Assessments – Research (cont.)
 Comparison Analysis: you need to implement
various state-of-the-art methods for the research
topic studied in your research project, and analyze
their pros and cons with your own insights
Assessments – Research (cont.)
 Methodology Novelty: if you propose a new and
effective method for the specific research topic,
even though it might be incremental, you can get
a bonus. You are encouraged to propose your own
methods based on your understanding of the
research topic
Key Dates
 Send information on group members and the
project via email:
 by 22nd Feb. 2019
 Submit files, i.e., the project report, video, source
code, through NTULearn:
 by 11:59pm, 21st Apr. 2019
Thank you!