Artificial Intelligence No 30: How to understand the maths for data science – part two
Welcome to Artificial Intelligence #30
We are close to 33K subscribers in 6 months. Thanks for your support as ever.
This week, we launch Digital Twins: Enhancing Model-based Design with AR, VR and MR (Online). The course is now oversubscribed; for a new (and very complex) topic, this is very nice to see. I am grateful to an amazing set of presenters, including
Dr David McKee – Slingshot Simulations and the Digital Twin Consortium
and our core team:
Dr Lars Kunze, Ajit Jaokar and Ayşe Mutlu
Also, in December, we launch the Artificial Intelligence: Cloud and Edge Implementations course.
Now in its fourth year, it is a full-stack AI course covering both the cloud and the edge, built around MLOps.
I have discussed before how I use a maths foundation in this course, why you need to understand the maths behind AI, and why many developers struggle with its mathematical foundations.
In this newsletter, I expand on my approach to maths for AI, and I hope you can benefit from these insights for your own learning.
It's easy to write a long book on this subject: all you do is start from the basics of matrices, probability and so on. That sounds complex, and in my experience it puts people off, because you do not want to learn matrices and vectors in the abstract.
What you want to learn is how these ideas apply to machine learning and deep learning, i.e. you are interested in the contra question: given a mathematical concept X, which machine learning algorithms use it?
For example:
Where are eigenvectors and eigenvalues used in machine learning?
The answer, of course, is PCA (see An intuitive understanding of eigenvectors is key to PCA).
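As a minimal sketch of this connection (assuming NumPy and synthetic data), PCA can be written directly as an eigendecomposition of the covariance matrix: the eigenvectors give the principal directions and the eigenvalues give the variance explained by each.

```python
import numpy as np

# Synthetic 2-D data with correlated features
rng = np.random.default_rng(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[3, 2], [2, 2]], size=500)

# PCA by hand: eigendecompose the covariance matrix of the centred data
X_centred = X - X.mean(axis=0)
eigenvalues, eigenvectors = np.linalg.eigh(np.cov(X_centred, rowvar=False))

# Sort components by descending eigenvalue (variance explained)
order = np.argsort(eigenvalues)[::-1]
eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]

# Project the data onto the principal components
X_pca = X_centred @ eigenvectors
print("variance explained ratio:", eigenvalues / eigenvalues.sum())
```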
So, here is my approach as it stands now.
This could also be useful if you are learning on your own.
Of course, I welcome comments if I have missed anything.
The approach is based on the following ideas:
- Abstract out the common elements (in the foundations section)
- View algorithms from multiple perspectives to understand the maths behind them
- Finally, separate out the supporting topics (feature engineering, model evaluation etc.)
This allows you to focus on the core, i.e. the algorithms, while also seeing how they interact.
I also include some references I like in the outline below.
1) Introduction
Four areas of maths:
- Linear algebra
- Statistical theory
- Probability
- Optimization
The machine learning pipeline
Overview of algorithms
When to use which algorithm
A mapping of problems to algorithms
See this good ML cheat sheet from Microsoft
What not to study
i.e. how to focus on a narrower set of topics
2) Foundations
Learning and functions
Understanding distributions
Design of experiments
Roger Peng's Executive Data Science is a good resource
Small data
Frequentist vs Bayesian
When machine learning meets complexity: why Bayesian deep learning is unavoidable
Are you a Bayesian or a frequentist?
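To make the contrast concrete, here is a minimal sketch (assuming SciPy and a simulated coin-flip experiment): the frequentist answer is a point estimate (the maximum likelihood estimate), while the Bayesian answer is a full posterior distribution, here a Beta posterior obtained by conjugacy.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
flips = rng.binomial(1, p=0.7, size=50)  # 50 tosses of a biased coin
heads, n = flips.sum(), flips.size

# Frequentist: a point estimate (the MLE) of the head probability
p_mle = heads / n

# Bayesian: start from a Beta(1, 1) prior and update to a Beta posterior
# (the Beta distribution is conjugate to the binomial likelihood)
posterior = stats.beta(1 + heads, 1 + n - heads)
print(f"MLE: {p_mle:.3f}")
print(f"Posterior mean: {posterior.mean():.3f}, "
      f"95% credible interval: {posterior.interval(0.95)}")
```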
Statistical Inference
What is the meaning of statistical inference
Frequentist inference
- p-value
- Confidence interval
- Null hypothesis
- Significance testing
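A minimal sketch of these frequentist quantities, assuming SciPy and simulated data: a two-sample t-test for the null hypothesis of equal means, and a confidence interval for one group's mean.

```python
import numpy as np
from scipy import stats

# Two samples; null hypothesis: both groups have the same mean
rng = np.random.default_rng(2)
group_a = rng.normal(loc=10.0, scale=2.0, size=40)
group_b = rng.normal(loc=11.0, scale=2.0, size=40)

# Significance test: the two-sample t-test gives a p-value for the null
t_stat, p_value = stats.ttest_ind(group_a, group_b)
print(f"t = {t_stat:.2f}, p-value = {p_value:.4f}")

# 95% confidence interval for the mean of group A
sem = stats.sem(group_a)
ci = stats.t.interval(0.95, df=len(group_a) - 1,
                      loc=group_a.mean(), scale=sem)
print("95% CI for mean of group A:", ci)
```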
Bayesian inference
- Bayesian concept learning
- Bayesian machine learning
- Probabilistic graphical models
- Bayesian decision theory
Estimation
- Estimation theory
- Parameter estimation
- Maximum likelihood estimators
- Bayes estimators
- Least squares
- Other estimators
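A minimal sketch tying these estimators together, assuming NumPy and a linear model with Gaussian noise (a setting where the least-squares estimate coincides with the maximum likelihood estimate):

```python
import numpy as np

# Data from y = 2x + 1 + noise; estimate the parameters from the sample
rng = np.random.default_rng(3)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=100)

# Least squares via the normal equations: beta = (X'X)^-1 X'y
X = np.column_stack([x, np.ones_like(x)])
beta = np.linalg.solve(X.T @ X, X.T @ y)
print("least-squares estimate (slope, intercept):", beta)

# Under Gaussian noise, least squares is also the MLE;
# NumPy's lstsq gives the same answer
beta_lstsq, *_ = np.linalg.lstsq(X, y, rcond=None)
print("lstsq estimate:", beta_lstsq)
```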
Objective functions and Optimization
Why optimization is important in machine learning
Supervised vs unsupervised
Stochastic vs deterministic
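Optimization matters because fitting a model usually means minimising an objective function. A minimal sketch, assuming NumPy and the same kind of synthetic linear data: fit w and b by (deterministic) gradient descent on the mean squared error.

```python
import numpy as np

# Minimise the mean squared error of y_hat = w*x + b by gradient descent
rng = np.random.default_rng(4)
x = rng.uniform(0, 10, size=100)
y = 2.0 * x + 1.0 + rng.normal(scale=1.5, size=100)

w, b, lr = 0.0, 0.0, 0.01
for step in range(2000):
    y_hat = w * x + b
    # Gradients of the MSE objective with respect to w and b
    grad_w = 2 * np.mean((y_hat - y) * x)
    grad_b = 2 * np.mean(y_hat - y)
    w -= lr * grad_w
    b -= lr * grad_b

print(f"fitted w = {w:.3f}, b = {b:.3f}")  # should approach 2 and 1
```

The stochastic variant would use a random mini-batch of the data at each step instead of the full sample.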
3) Machine learning and deep learning perspectives
Machine learning from a traditional perspective
- Linear regression
- Improving linear regression (Lasso, Ridge)
- Logistic regression
- GLMs
- Linear discriminant analysis
- Naïve Bayes classifier
- Non-linear regression methods
- Non-linear classification models
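A minimal sketch of the first few items, assuming scikit-learn and synthetic data: ordinary least squares next to its Ridge and Lasso regularised variants.

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.model_selection import train_test_split

# Synthetic regression problem with some uninformative features
X, y = make_regression(n_samples=200, n_features=10, n_informative=4,
                       noise=10.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Plain least squares vs the two regularised variants mentioned above
for model in (LinearRegression(), Ridge(alpha=1.0), Lasso(alpha=1.0)):
    model.fit(X_train, y_train)
    print(f"{type(model).__name__}: R^2 = {model.score(X_test, y_test):.3f}")
```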
Machine learning from a Bayesian perspective
Probabilistic Machine Learning: An Introduction by Kevin Murphy
Statistical Foundations of Machine Learning (2nd ed.) by Gianluca Bontempi
Discriminative vs generative models
Parametric vs non-parametric models
Non-parametric models (Exemplar-based methods, Kernel methods, Tree structures, Ensemble learning: bagging and boosting)
Deep learning algorithms
Core deep learning (MLP); deep learning for images (CNN); deep learning for sequences (LSTM)
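A minimal sketch of the MLP case, assuming scikit-learn and its bundled digits dataset (CNNs and LSTMs need a dedicated framework such as TensorFlow or PyTorch, so they are left out here):

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Small image-classification task: 8x8 digit images, flattened to vectors
X, y = load_digits(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# A multilayer perceptron with two hidden layers
mlp = MLPClassifier(hidden_layer_sizes=(64, 32), max_iter=1000,
                    random_state=0)
mlp.fit(X_train, y_train)
print(f"test accuracy: {mlp.score(X_test, y_test):.3f}")
```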
A taxonomy of machine learning models
I discussed a taxonomy of algorithms based on Peter Flach's book in a previous newsletter.
4) Supporting topics
Feature engineering
(Feature Extraction, Feature Transformation, Feature Selection)
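A minimal sketch of all three steps, assuming scikit-learn and synthetic data: scaling as a transformation, polynomial interaction terms as extraction, and univariate selection to keep the most informative features.

```python
from sklearn.datasets import make_classification
from sklearn.preprocessing import StandardScaler, PolynomialFeatures
from sklearn.feature_selection import SelectKBest, f_classif

X, y = make_classification(n_samples=200, n_features=6, n_informative=3,
                           random_state=0)

# Transformation: scale features to zero mean and unit variance
X_scaled = StandardScaler().fit_transform(X)

# Extraction: derive new features (here, interaction and squared terms)
X_poly = PolynomialFeatures(degree=2,
                            include_bias=False).fit_transform(X_scaled)

# Selection: keep the k features most associated with the target
X_selected = SelectKBest(f_classif, k=5).fit_transform(X_poly, y)
print(X.shape, "->", X_poly.shape, "->", X_selected.shape)
```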
Model evaluation
Cross validation
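A minimal sketch of cross-validation, assuming scikit-learn and one of its bundled datasets: each of the five folds serves once as the held-out test set, giving a more robust estimate of model performance than a single split.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = load_breast_cancer(return_X_y=True)

# 5-fold cross-validation of a simple classifier
scores = cross_val_score(LogisticRegression(max_iter=5000), X, y, cv=5)
print("fold accuracies:", scores.round(3))
print(f"mean accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```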
Unsupervised learning
I welcome your thoughts.
This week I read one of the best books on the future of AI: The Technological Singularity by Murray Shanahan.
It is part of the MIT Press Essential Knowledge series, which I also like: very high-quality but concise books.
Finally, the Artificial Intelligence: Cloud and Edge Implementations course is almost full. If you are interested, please apply through the link above.
In this newsletter: Lars Kunze, Dirk Hartmann, Phil Chacko, David Menard, Keith Myers, Ayşe Mutlu, David McKee, Robbie Stevens, Francesco Ciriello, The MIT Press, Gianluca Bontempi, Slingshot Simulations