DSA5105 Lecture 1
Soufiane Hayou
Department of Mathematics
Logistics
• Instructor
Soufiane Hayou (Department of Mathematics)
Office: S17-05-08
Email: [email protected]
• Notes, slides and code are/will be available on Canvas
• Assessment
• Homework (3 x 5%) + Project (15%) (Individual) [30%]
• Mid-term Test [20%]
• Final Exam [50%]
Logistics
Lectures
• F2F (LT28) + live (Zoom); recordings will be uploaded
• You are encouraged to turn on your camera (if you're following on Zoom)
• Active participation is encouraged.
• Consultation arrangements will be announced later
Introduction
Can machines think?
I propose to consider the question, "Can machines think?" This should begin with definitions of the meaning of the terms "machine" and "think". The definitions might be framed so as to reflect so far as possible the normal use of the words, but this attitude is dangerous…
(Alan Turing, "Computing Machinery and Intelligence", 1950)
Example: what does 8 ÷ 2(2+2) evaluate to?
A calculator routine calc(*args) returns 16.
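The famous ambiguity of 8 ÷ 2(2+2) disappears once the expression is written in an unambiguous programming syntax. A minimal sketch in Python (the name `calc` is taken from the slide; its implementation here is an assumption for illustration):

```python
def calc(expr: str) -> float:
    # Evaluate an arithmetic expression string using Python's own
    # precedence rules. Note that implicit multiplication is not
    # allowed: 2(2+2) must be written explicitly as 2*(2+2).
    # (eval is fine for a classroom sketch, not for untrusted input.)
    return eval(expr)

# Division and multiplication share precedence and associate
# left-to-right, so this is (8/2)*(2+2) = 4*4.
print(calc("8 / 2 * (2 + 2)"))  # 16.0
```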
Topics
• Supervised learning
• Linear and nonlinear models
• Basic learning and approximation theory
• Learning/optimization algorithms
• Unsupervised learning
• Dimensionality reduction, clustering and generative models
• Reinforcement learning
• Markov decision processes, reinforcement learning
algorithms
What this class is
• A (hopefully) gentle introduction to the exciting world of
machine learning
• A holistic view of the modern interplay of machine
learning with mathematics, statistics, computer science,
physical sciences and engineering
Other examples
• Video captures
• Financial time series
• Numerical measurements from experiments
What about general discrete data?
We make an important distinction
• Ordinal data
Data that has a natural notion of order, e.g.
• Star ratings of a product
• Level of language proficiency
• Letter grades of a class
• Nominal data
Data that has no order, e.g.
• Categories of image classification
• Answers to True/False questions
We need to embed such discrete data into something we can represent on a computer, e.g. real/floating-point numbers.
The type of embedding depends on the nature of the data!
• Ordinal data
We want the embedding to preserve the ordering, so we typically use real numbers
• Nominal data
This is somewhat the opposite: we want the embedding not to introduce spurious ordering, e.g. one-hot embedding
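The two embedding strategies above can be sketched as follows (the function names and example categories are illustrative, not from the lecture):

```python
def embed_ordinal(value, levels):
    """Map an ordinal category to a real number that preserves order."""
    # e.g. letter grades: levels = ["F", "D", "C", "B", "A"]
    # maps to 0.0, 1.0, 2.0, 3.0, 4.0 in increasing order
    return float(levels.index(value))

def embed_one_hot(value, categories):
    """Map a nominal category to a one-hot vector (no spurious order)."""
    return [1.0 if c == value else 0.0 for c in categories]

print(embed_ordinal("B", ["F", "D", "C", "B", "A"]))  # 3.0
print(embed_one_hot("cat", ["cat", "dog", "bird"]))   # [1.0, 0.0, 0.0]
```

The one-hot vectors are all equidistant from each other, which is exactly the point: no category is "closer" to another.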
Classes of machine learning problems
Supervised Learning: Regression; Classification; Function approximation; Inverse problems/design; …
Unsupervised Learning: Clustering; Dimensionality reduction; Generative models; Anomaly detection; …
Reinforcement Learning: Value iteration; Policy gradient; Actor-critic; Exploration; …
Examples
• Image recognition
• Weather prediction
• Stock price prediction
• …
Given dataset $\{(x_i, y_i)\}_{i=1}^{N}$
Inputs: $x_i$; Outputs/labels: $y_i$; Data size: $N$
Goal: learn the relationship from $x_i$ to $y_i$
Example: $x_1$ = (image of a cat), $y_1$ = "Cat"
Minimize the average loss over the dataset, $\min_f \frac{1}{N}\sum_{i=1}^{N} L(f(x_i), y_i)$. This is called empirical risk minimization (ERM).
So, is learning just optimization?
We want to do well on unseen data! In other words, our model
must generalize.
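As a concrete illustration of the gap between fitting and generalizing (the dataset and model below are invented for this sketch, not from the lecture), a high-degree polynomial can drive the empirical (training) risk to essentially zero while the risk on unseen data stays much larger:

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = np.linspace(0, 1, 10)
y_train = np.sin(2 * np.pi * x_train) + 0.1 * rng.standard_normal(10)
x_test = np.linspace(0, 1, 100)
y_test = np.sin(2 * np.pi * x_test)

# A degree-9 polynomial through 10 points: the empirical risk
# is driven to (numerically) zero by interpolating the noise...
coeffs = np.polyfit(x_train, y_train, deg=9)
train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
test_mse = np.mean((np.polyval(coeffs, x_test) - y_test) ** 2)

# ...but the risk on unseen data (the generalization gap) is large.
print(train_mse, test_mse)
```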
What we can solve: empirical risk minimization, whose solution is the empirical minimizer $\hat{f}$.
What we really want to solve: population risk minimization, whose solution is $\tilde{f}$.
In general $\hat{f} \neq \tilde{f}$; the difference between the two risks is the generalization gap.
Three paradigms of supervised learning
Fix a hypothesis space $\mathcal{H}$ and a target function $f^*$:
• Approximation: how well the best hypothesis $\tilde{f} \in \mathcal{H}$ can approximate $f^*$
• Optimization: starting from an initial guess $f_0$, find the empirical minimizer $\hat{f}$ (using the dataset)
• Generalization: how close $\hat{f}$ is to $\tilde{f}$
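The three paradigms combine into a standard error decomposition (a sketch; $R$ denotes the population risk, other notation as above):

```latex
\underbrace{R(\hat{f}) - R(f^*)}_{\text{total error}}
  = \underbrace{R(\hat{f}) - R(\tilde{f})}_{\text{optimization + generalization}}
  + \underbrace{R(\tilde{f}) - R(f^*)}_{\text{approximation}}
```

Making the total error small requires controlling all three terms at once: a richer $\mathcal{H}$ shrinks the approximation term but tends to inflate the other two.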
Linear Models
Simple linear regression: fit $y \approx w_0 + w_1 x$ by minimizing the sum of squared errors.
Solution: $\hat{w}_1 = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sum_i (x_i - \bar{x})^2}$ and $\hat{w}_0 = \bar{y} - \hat{w}_1 \bar{x}$
In compact form:
$$\Phi = \begin{pmatrix} \phi_0(x_1) & \cdots & \phi_{M-1}(x_1) \\ \phi_0(x_2) & \cdots & \phi_{M-1}(x_2) \\ \vdots & \ddots & \vdots \\ \phi_0(x_N) & \cdots & \phi_{M-1}(x_N) \end{pmatrix}, \quad w = \begin{pmatrix} w_0 \\ w_1 \\ \vdots \\ w_{M-1} \end{pmatrix}, \quad y = \begin{pmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{pmatrix}$$
We want to solve $\Phi w = y$ in the least-squares sense: minimize $\|\Phi w - y\|^2$.
Rearranging, we have $\hat{w} = (\Phi^\top \Phi)^{-1} \Phi^\top y$, the general ordinary least squares formula.
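A minimal numerical check of the ordinary least squares formula (the feature map and data below are invented for this sketch):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.uniform(0, 1, size=20)
y = 1.0 + 2.0 * x + 0.01 * rng.standard_normal(20)  # y ≈ 1 + 2x + noise

# Design matrix Phi with basis functions phi_0(x) = 1, phi_1(x) = x
Phi = np.column_stack([np.ones_like(x), x])

# General OLS formula: solve (Phi^T Phi) w = Phi^T y
# (np.linalg.solve avoids forming the inverse explicitly)
w_formula = np.linalg.solve(Phi.T @ Phi, Phi.T @ y)

# Should agree with numpy's built-in least-squares solver
w_lstsq, *_ = np.linalg.lstsq(Phi, y, rcond=None)
print(w_formula)  # close to [1.0, 2.0]
```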
Regularized least squares: minimize $\|\Phi w - y\|^2 + \lambda \Omega(w)$, where $\lambda \Omega(w)$ is the regularizer.
Types of regularization
• $\ell_2$ regularization, $\Omega(w) = \|w\|_2^2$ (ridge regression)
• $\ell_1$ regularization, $\Omega(w) = \|w\|_1$ (least absolute shrinkage and selection operator, or lasso)
• …
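The ridge ($\ell_2$) case keeps a closed-form solution, $(\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top y$; a brief sketch with invented data, showing that increasing $\lambda$ shrinks the weights:

```python
import numpy as np

def ridge(Phi, y, lam):
    """Ridge regression: minimize ||Phi w - y||^2 + lam * ||w||_2^2."""
    M = Phi.shape[1]
    # Closed form: (Phi^T Phi + lam I)^{-1} Phi^T y
    return np.linalg.solve(Phi.T @ Phi + lam * np.eye(M), Phi.T @ y)

rng = np.random.default_rng(2)
Phi = rng.standard_normal((30, 5))
y = rng.standard_normal(30)

w0 = ridge(Phi, y, lam=0.0)    # plain least squares
w1 = ridge(Phi, y, lam=10.0)   # heavier shrinkage toward zero
print(np.linalg.norm(w0), np.linalg.norm(w1))
```

By contrast, the lasso objective is not differentiable at zero and has no closed form; it is typically solved iteratively, e.g. by coordinate descent.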
Regularization and generalization
We apply regularization to the over-fitting examples.
Recall: for ridge regression $\hat{w}_\lambda = (\Phi^\top \Phi + \lambda I)^{-1} \Phi^\top y$, so larger $\lambda$ shrinks $\hat{w}_\lambda$ toward zero, but over-regularizing leads to under-fitting.
We require a slight change of hypothesis space