
CZ4041/CE4041:

Machine Learning
Lecture 1a: Introduction
Sinno Jialin PAN
School of Computer Science and Engineering
NTU, Singapore
Homepage: http://www.ntu.edu.sg/home/sinnopan
General Information
• Instructor (solo):
  – Dr. Sinno Jialin PAN
• Lecture time/venue:
  – Tuesdays 12:30 – 2:30pm @ LT7
• Tutorial time/venue:
  – Thursdays 8:30 – 9:30am @ LT6 (starting from the 2nd week)
General Information (cont.)
Lectures:
• Week 1 – Week 12: Overview, Supervised Learning, Ensemble Learning, Unsupervised Learning, Dimensionality Reduction
• Note: No classes in Week 4 (Chinese New Year)

Tutorials:
• Week 2 – Week 12
• Note: No tutorial in Week 3 (Overseas Conference)
General Information (cont.)
• Q&A
  – Office Hours @ N4, #02b-44: Tuesdays 2:30 – 3:00pm
  – Make an appointment via email: [email protected]
  – Send me questions via email
• Course Webpage
  – In NTULearn
Evaluation
• Course project (40%)
  – Group-based
  – Course report (30%) + presentation video (10%)
• Open-book final exam (60%)
  – 2 hours
Hot Keywords in the IT Sector

[Slide graphic: word cloud of hot IT keywords, highlighting Data Science and Machine Learning]
What is Machine Learning?
• Motivated by how human beings learn from examples/experience/exercise
• Focuses on the development of computer programs that can teach themselves to grow from data and change when exposed to new data
A Motivating Example
• Given a face image, classify the face's gender
• Once upon a time, to develop an AI system to solve such a task, developers or domain experts needed to provide rules and implement them in the system (sketched in code below):
  – If the face has long hair and does not have a moustache, then this is a "female" face;
  – If the face has short hair and a moustache, then this is a "male" face.
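A minimal sketch of this rule-based approach, assuming two hypothetical boolean features (long_hair and moustache) already extracted from the image:

```python
def classify_face(long_hair: bool, moustache: bool) -> str:
    """Hand-coded rules from a domain expert (hypothetical features)."""
    if long_hair and not moustache:
        return "female"
    if not long_hair and moustache:
        return "male"
    return "unknown"  # inputs not covered by the rules fall through

print(classify_face(long_hair=True, moustache=False))  # female
print(classify_face(long_hair=True, moustache=True))   # unknown
```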
A Motivating Example (cont.)
• Limitations:
  – Time-consuming
  – The defined rules may not be complete
  – Not able to handle uncertainty

  If the face has long hair and does not have a moustache, then this is a "female" face;
  If the face has short hair and a moustache, then this is a "male" face.

A Motivating Example (cont.)
• How about letting the machine learn the rules by itself?
• The computer is presented with example inputs and their desired outputs, given by humans; the goal is to "learn" a general rule or "model" that maps inputs to outputs

[Figure: example inputs (four face images) and their outputs Female, Female, Male, Male, from which a model is learned]
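As a minimal contrasting sketch (using the same two hypothetical boolean features as above), a classifier can be fit on the labeled examples instead of hand-coding the rules; scikit-learn's decision tree is used here purely for illustration:

```python
from sklearn.tree import DecisionTreeClassifier

# Each example: [long_hair, moustache] (hypothetical features); labels given by humans
X = [[1, 0], [1, 0], [0, 1], [0, 1]]
y = ["female", "female", "male", "male"]

model = DecisionTreeClassifier().fit(X, y)  # "learn" the rule from examples
print(model.predict([[1, 0], [0, 0]]))      # predictions for new inputs
```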
Machine Learning Definition
• Machine learning is a type of artificial intelligence that provides computers with the ability to learn from examples/experience without being explicitly programmed

Traditional AI:    Inputs + Rules → Machine → Outputs
Machine Learning:  Inputs + Outputs → Machine → Prediction Model (or Rules)
Deployment:        Inputs → Prediction Model → Outputs
How to Represent an Example?

• Feature engineering (not a machine learning focus)
• Representation learning (one of the crucial research topics in machine learning)
  – Deep learning is currently the most effective approach to representation learning
Machine Learning = Deep Learning = AI?
• Machine learning is one field of AI – there are many other fields
• Deep learning is one family of machine learning methodologies – there are many other methodologies in machine learning
• Machine learning has become a primary mechanism for data analytics (key in data science)
• Nowadays, machine learning is increasingly interdisciplinary:
  – Distributed/parallel computing + machine learning → distributed/parallel machine learning
  – Machine learning + hardware → AI chips
Different Paradigms/Settings
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Supervised Learning
• The examples presented to computers are pairs of inputs and their corresponding outputs; the goal is to "learn" a mapping or model from inputs to outputs

[Figure: labeled training data – four face images (inputs) with outputs Female, Female, Male, Male]

f: label = f(input)

• Outputs are discrete values → classification
• Outputs are continuous values → regression
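As a quick sketch of the two settings (toy numbers, not from the slides), the same fit/predict pattern covers a classifier for discrete outputs and a regressor for continuous outputs:

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: discrete outputs (labels 0/1)
X_cls, y_cls = [[0.1], [0.4], [0.6], [0.9]], [0, 0, 1, 1]
clf = LogisticRegression().fit(X_cls, y_cls)
print(clf.predict([[0.2], [0.8]]))  # e.g., [0 1]

# Regression: continuous outputs
X_reg, y_reg = [[1.0], [2.0], [3.0], [4.0]], [1.1, 1.9, 3.2, 3.9]
reg = LinearRegression().fit(X_reg, y_reg)
print(reg.predict([[5.0]]))  # roughly 5.0
```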
Supervised Learning – Regression I
Supervised Learning – Regression II
Stock price prediction
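A minimal sketch of casting stock price prediction as supervised regression (entirely synthetic prices, for illustration only): the previous few days' prices are the inputs, and the next day's price is the output:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

prices = np.array([10.0, 10.2, 10.1, 10.5, 10.4, 10.8, 11.0, 10.9])  # synthetic

window = 3  # inputs: the previous 3 prices
X = np.array([prices[i:i + window] for i in range(len(prices) - window)])
y = prices[window:]  # output: the next day's price

model = LinearRegression().fit(X, y)
print(model.predict([prices[-window:]]))  # predicted next price
```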
Different Paradigms/Settings
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Unsupervised Learning
• The examples presented to computers are a set of inputs without any outputs; the goal is to "learn" an intrinsic structure of the examples, e.g., clusters of examples or the density of the examples

[Figure: unlabeled training data – face images (inputs) grouped into clusters of similar faces]
Unsupervised Learning – Clustering
User Segmentation
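A minimal user-segmentation sketch (hypothetical 2-D user features): k-means, covered later in the course, discovers the groups without any labels:

```python
from sklearn.cluster import KMeans

# Hypothetical features per user: [visits per week, average spend]
X = [[1, 5], [2, 6], [1, 4],       # casual users
     [9, 50], [10, 55], [8, 48]]   # heavy spenders

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X)
print(kmeans.labels_)           # cluster assignment for each user
print(kmeans.cluster_centers_)  # one center per discovered segment
```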
Different Paradigms/Settings
• Supervised Learning
• Unsupervised Learning
• Reinforcement Learning
Reinforcement Learning
• Learning by interacting with an environment to achieve a goal
• Goal: to learn an optimal policy mapping states to actions

[Diagram: the agent transitions through states S0 → S1 → S2 → S3, taking actions a0, a1, a2 and receiving rewards r0, r1, r2]
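A minimal sketch of this state-action-reward loop on a tiny hypothetical chain environment (not from the slides): tabular Q-learning learns a policy from interaction alone:

```python
import random

n_states, n_actions = 4, 2  # hypothetical chain: states 0..3, actions 0=left, 1=right
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, epsilon = 0.1, 0.9, 0.2

def step(s, a):
    """Hypothetical environment: reaching the last state gives reward 1 and ends."""
    s2 = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s2, (1.0 if s2 == n_states - 1 else 0.0), s2 == n_states - 1

for _ in range(500):  # episodes of interaction
    s, done = 0, False
    while not done:
        a = (random.randrange(n_actions) if random.random() < epsilon
             else max(range(n_actions), key=lambda i: Q[s][i]))  # epsilon-greedy
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

print([max(range(n_actions), key=lambda i: Q[s][i]) for s in range(n_states)])
# the learned policy prefers action 1 (move right) in states 0-2
```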

Reinforcement Learning (cont.)
• Deep Q-Network (DQN) [1]
• Plays Atari 2600 games

[Diagram: the agent observes the game screen, its policy selects an action, and the environment returns a reward]

[1] Mnih et al., "Human-level control through deep reinforcement learning," Nature, 2015.
Course Content
Supervised Learning: Classification, Regression, Ensemble Learning
Unsupervised Learning: Clustering, Density Estimation, Dimensionality Reduction

Out-of-scope but interesting topics:
• Reinforcement Learning: "Reinforcement learning: a survey" (covered by CZ/CE4046)
• Semi-supervised Learning: "Semi-supervised learning literature survey"
• Active Learning: "Active learning literature survey"
• Transfer Learning: "A survey on transfer learning"
Course Objective
• To provide students with essential concepts and principles of machine learning algorithms
• To enable students to understand how to use or design various machine learning techniques to solve supervised and unsupervised learning problems
Breadth and Depth
• Classification (through lectures):
  – Bayesian Decision Theory
  – Bayesian Classifiers (Naïve Bayes & Bayesian Networks)
  – Decision Trees
  – Artificial Neural Networks
  – Support Vector Machines and Kernel Machines (additional notes)
  – Nearest-neighbor Classifiers
• Regression (through lectures):
  – Linear Regularized Least-Squares Regression and its kernelized version (additional notes)
Breadth and Depth (cont.)
• Clustering (through lectures):
  – K-means and its variants
  – Hierarchical clustering
• Density Estimation (through lectures):
  – Parametric methods
  – Non-parametric methods (a sketch contrasting the two follows below)
• Real-world Applications or Advanced Research Topics (through course project)
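A minimal sketch of the parametric vs. non-parametric contrast (synthetic 1-D data, for illustration): a parametric method fits a fixed-form distribution, here a Gaussian via its maximum-likelihood parameters, while kernel density estimation lets the data shape the density:

```python
import numpy as np
from scipy.stats import norm, gaussian_kde

data = np.random.default_rng(0).normal(loc=2.0, scale=1.5, size=200)  # synthetic

# Parametric: assume a Gaussian and estimate its two parameters (MLE)
mu, sigma = data.mean(), data.std()
print(norm.pdf(2.0, mu, sigma))  # density estimate at x = 2.0

# Non-parametric: kernel density estimation, no fixed functional form
kde = gaussian_kde(data)
print(kde.evaluate([2.0]))       # density estimate at x = 2.0
```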

Breadth and Depth (cont.)
• Focus on introducing well-known concepts and fundamental methodologies of machine learning
  – Motivations
  – Derivation of the mathematical models (mathematics)
• For those who want to learn more, some up-to-date techniques and advanced issues will be mentioned
  – As details cannot be covered in lectures, some additional reading materials will be suggested (optional)
Relationships to Other Modules
CZ4041/CE4041: Machine Learning
• Classification: Bayesian Decision Theory; Bayesian Classifiers (Naïve Bayes & Bayesian Networks); Decision Trees; Artificial Neural Networks; Support Vector Machines & Kernelization; Nearest-Neighbor Classifier
• Regression: Linear Regression & Kernelization
• Clustering: K-means and its variants; Hierarchical clustering
• Density Estimation

CZ3005: Artificial Intelligence
• Modern AI approaches
• Classic AI approaches: Search; First-Order Logic

CZ4042/CE4042: Neural Networks
• Various structures of neural networks
Relationships to Other Modules (cont.)
CZ4041/CE4041: Machine Learning (aim: deeply understand principles)
• Classification: Bayesian Decision Theory; Bayesian Classifiers (Naïve Bayes & Bayesian Networks); Decision Trees; Artificial Neural Networks; Support Vector Machines & Kernelization; Nearest-Neighbor Classifier
• Regression: Linear Regression & Kernelization
• Clustering: K-means and its variants; Hierarchical clustering
• Density Estimation
• Dimensionality Reduction

CZ4032/CE4032: Data Analytics and Mining (aim: how to use)
• Data Preprocessing
• Classification: Decision Trees; Rule-based Classifiers; Naïve Bayes; Nearest-Neighbor Classifiers; Artificial Neural Networks; Support Vector Machines
• Clustering: K-means and its variants; Hierarchical clustering; Density-based
• Association Rule Mining
• Dimensionality Reduction


Mathematics Background
• Various learning paradigms: supervised learning, unsupervised learning, semi-supervised learning, active learning, transfer learning, reinforcement learning, etc.
• Various types of methodologies: graphical models, deep learning, empirical risk minimization, entropy-based models, kernel methods, etc.
• Various mathematical techniques: probability theory, linear algebra, calculus, optimization, information theory, functional analysis, etc.
Course Schedule (Tentative)
Week | Date | Topics
Week 1 | 15th Jan. | Introduction, Overview of Supervised Learning
Week 2 | 22nd Jan. | Bayesian Decision Theory, Naïve Bayes Classifier I
Week 3 | 29th Jan. | Naïve Bayes Classifier II, Bayesian Belief Networks
Week 4 | 5th Feb. | No Classes – Chinese New Year
Week 5 | 12th Feb. | Decision Trees & Generalization
Week 6 | 19th Feb. | Perceptron, Multilayer Neural Networks
Week 7 (e-learning) | 26th Feb. | Nearest-Neighbor Classifier
Week 8 | 12th Mar. | Support Vector Machines, Regularized Regression
Week 9 | 19th Mar. | Model Evaluation, Ensemble Learning
Week 10 | 26th Mar. | Clustering: K-means & Hierarchical Clustering
Week 11 | 2nd Apr. | Parametric & Non-parametric Density Estimation
Week 12 | 9th Apr. | Dimensionality Reduction
Textbook and Reference
• Textbook:
  – Introduction to Machine Learning (2nd Ed.), by Ethem Alpaydin, The MIT Press, 2010.
• References:
  – Pattern Classification (2nd Ed.), by Richard Duda, Peter Hart, and David Stork, Wiley-Interscience, 2000.
  – Introduction to Data Mining, by Pang-Ning Tan, Michael Steinbach, and Vipin Kumar, Addison Wesley, 2005.
  – Pattern Recognition and Machine Learning, by Christopher M. Bishop, Springer, 2006.
Useful Resources: Datasets
• UCI Repository: http://www.ics.uci.edu/~mlearn/MLRepository.html
• Kaggle: http://www.kaggle.com/
Useful Resources: Libraries
• scikit-learn (Python): http://scikit-learn.org/stable/
• MALLET (Java): http://mallet.cs.umass.edu/
• Weka (Java): http://www.cs.waikato.ac.nz/ml/weka/
• TensorFlow: https://www.tensorflow.org/
• PyTorch: https://pytorch.org/
• Many other deep learning libraries: http://deeplearning.net/software_links/
Useful Resources: Conferences
• International Conference on Machine Learning (ICML)
• Neural Information Processing Systems (NIPS)
• Conference on Learning Theory (COLT)
• Uncertainty in Artificial Intelligence (UAI)
• International Conference on AI & Statistics (AISTATS)
• International Joint Conference on Artificial Intelligence (IJCAI)
• AAAI Conference on Artificial Intelligence (AAAI)
• International Conference on Learning Representations (ICLR)
Useful Resources: Journals
• Journal of Machine Learning Research (JMLR)
• Machine Learning (MLJ)
• IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)
• IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
• Artificial Intelligence (AIJ)
• Journal of Artificial Intelligence Research (JAIR)
Detailed Project Description
• This is a group-based course project
• Each group consists of 4-5 members
• Each group can choose either one selected Kaggle competition or one selected research topic as the course project
  – A list of selected Kaggle competitions and research topics will be provided in the tutorial of Week 2
Programming Languages
• Programming languages:
  – Any programming language can be used, e.g., Matlab, Python, C/C++, Java, R, etc.
  – Any open-source ML toolbox can be used
• Note: for Kaggle competitions, directly using the source code released by other participants is not allowed (a 20% penalty will be applied if found)
General Information of Kaggle
Kaggle.com:
• Provides training file(s) containing the (x_i, y_i) pairs
• Provides a test file containing only the x_i's

Participants:
• Build a predictive model from the training data
• Submit a prediction file containing only the predicted y_i's
• Submissions are scored and ranked on the leaderboard
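A minimal sketch of this workflow (hypothetical file and column names such as train.csv, id, and target; real competitions define their own): train on the labeled file, predict on the test file, and write a submission CSV:

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

train = pd.read_csv("train.csv")  # features plus the target column (hypothetical names)
test = pd.read_csv("test.csv")    # features only

features = [c for c in train.columns if c not in ("id", "target")]
model = RandomForestClassifier(random_state=0).fit(train[features], train["target"])

submission = pd.DataFrame({"id": test["id"], "target": model.predict(test[features])})
submission.to_csv("submission.csv", index=False)  # this file goes to the leaderboard
```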
Submission (Kaggle)
• Submitted files:
  1. A project report
  2. A presentation video
  3. The final .csv file of your prediction results, as submitted to the specific Kaggle competition you participate in
  4. Your source code (with a readme file)
• Notes:
  – The submitted .csv file is used to double-check that the reported results are correct
  – The submitted source code is used to double-check that it was not simply copied from code released by other participants
Submission (Research)
• Submitted files:
  1. A project report
  2. A presentation video
  3. Your source code (with a readme file)
• Notes:
  – The submitted source code is used to double-check that the reported results are correct
  – Report format: 12-point font, single-spaced, 20-25 pages
Format of Report and Video
• Presentation video:
  – Introduce your course project in a video 10-15 minutes long
  – You can use any tool to produce the video, e.g., simply PowerPoint or more advanced tools
  – File size ≤ 8 MB
  – Some examples for reference:
    https://www.youtube.com/channel/UCSBrGGR7JOiSyzl60OGdKYQ
    https://www.youtube.com/channel/UC_sfvZvvPUbOQhDs_cqlx_A
• Report format:
  – 12-point font, single-spaced, 20-25 pages
Content of Project Report (Kaggle)
• Specific roles of each group member
• An evaluation score and ranked position of your prediction results for the specific competition on Kaggle
• Problem statement (in your own words, not copied and pasted from Kaggle)
• Challenges of the problem
• Your proposed solution in detail (preprocessing, feature engineering/representation learning, methodologies, etc.)
• Experiments demonstrating why your proposed solution is appropriate for the problem
• Conclusion: what you have learned from the project
Content of Project Report (Research)
• Specific roles of each group member
• A review of the specific research topic
• Your newly proposed method, if applicable
• Comparison experiments on state-of-the-art methods (and your proposed method, if applicable)
• Analysis of the pros and cons of the compared methods
• Conclusion: your own insights on the research project
Assessments on Project Reports
Kaggle competitions:
• Leaderboard performance
• Convincingness
• Solution novelty
• Writing

Research-based projects:
• Literature review
• Comparison analysis
• Methodology novelty
• Writing

Note: the writing assessment evaluates whether the report is well organized and easy to follow, and whether it contains many typos.
Assessments – Kaggle
• Leaderboard performance: although all the listed Kaggle competitions have ended, you can still submit your results to Kaggle to obtain an evaluation score and a ranking position
• The performance assessment is based on the relative ranking of your results in the specific competition (i.e., top 20%, top 40%, top 60%, top 80%, and top 100%)
Assessments – Kaggle (cont.)
• Solution novelty: on Kaggle.com, most participants or winners discuss their solutions on the forums of the specific competitions
  – If you propose a new and effective solution, you can get a bonus. You are encouraged to propose your own solutions based on your own understanding of the competition
Assessments – Kaggle (cont.)
• Convincingness: the goal of the project report is to convince readers that your proposed solution is appropriate for the specific machine learning task. In your report, you need to conduct experiments to verify your proposed ideas
Assessments – Research
• Literature review: as this is a research project, figuring out what has been done in the literature is important. You should provide a comprehensive review of the specific research topic studied in your project
Assessments – Research (cont.)
• Comparison analysis: you need to implement various state-of-the-art methods for the research topic studied in your project and analyze their pros and cons with your own insights
Assessments – Research (cont.)
• Methodology novelty: if you propose a new and effective method for the specific research topic, even if it is incremental, you can get a bonus. You are encouraged to propose your own methods based on your understanding of the research topic
Key Dates
• Send information on group members and the chosen project via email:
  – by 22nd Feb. 2019
• Submit the files (i.e., the project report, video, and source code) through NTULearn:
  – by 11:59pm, 21st Apr. 2019
Thank you!
