Caltech's Machine Learning – CS 156 by Prof. Yaser Abu-Mostafa


Overview – Machine Learning Course

Overview Lecture of Caltech’s Machine Learning Course – CS 156 by Professor Yaser Abu-Mostafa.

Lecture 01 – The Learning Problem

The Learning Problem – Introduction; supervised, unsupervised, and reinforcement learning. Components of the learning problem.

Lecture 02 – Is Learning Feasible?

Is Learning Feasible? – Can we generalize from a limited sample to the entire space? Relationship between in-sample and out-of-sample performance.

Lecture 03 – The Linear Model I

The Linear Model I – Linear classification and linear regression. Extending linear models through nonlinear transforms.

Lecture 04 – Error and Noise

Error and Noise – The principled choice of error measures. What happens when the target we want to learn is noisy.

Lecture 05 – Training Versus Testing

Training versus Testing – The difference between training and testing in mathematical terms. What makes a learning model able to generalize?

Lecture 06 – Theory of Generalization

Theory of Generalization – How an infinite model can learn from a finite sample. The most important theoretical result in machine learning.

Lecture 07 – The VC Dimension

The VC Dimension – A measure of what it takes for a model to learn. Relationship to the number of parameters and degrees of freedom.

Lecture 08 – Bias-Variance Tradeoff

Bias-Variance Tradeoff – Breaking down the learning performance into competing quantities. The learning curves.

Lecture 09 – The Linear Model II

The Linear Model II – More about linear models. Logistic regression, maximum likelihood, and gradient descent.
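The blurb above mentions gradient descent applied to logistic regression; as a minimal sketch (the toy dataset, learning rate, and iteration count are illustrative assumptions, not from the course):

```python
import numpy as np

# Toy sketch: batch gradient descent on the logistic-regression
# cross-entropy error E(w) = (1/N) sum_n ln(1 + exp(-y_n w.x_n)).
# Data, learning rate, and iteration count are made up for illustration.
X = np.array([[1.0, 2.0], [1.0, -1.0], [-2.0, 1.0], [-1.0, -2.0]])
y = np.array([1, 1, -1, -1])  # labels in {-1, +1}

w = np.zeros(2)
eta = 0.1  # learning rate (assumed)
for _ in range(1000):
    # gradient of the in-sample error with respect to w
    grad = -np.mean((y[:, None] * X) / (1 + np.exp(y * (X @ w)))[:, None], axis=0)
    w -= eta * grad  # move against the gradient

# after training, the sign of w.x gives the predicted label
preds = np.sign(X @ w)
```

On this separable toy set the learned weights classify all four points correctly; the lecture covers why the cross-entropy error is the maximum-likelihood choice for logistic regression.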

Lecture 10 – Neural Networks

Neural Networks – A biologically inspired model. The efficient backpropagation learning algorithm. Hidden layers.

Lecture 11 – Overfitting

Overfitting – Fitting the data too well; fitting the noise. Deterministic noise versus stochastic noise.

Lecture 12 – Regularization

Regularization – Putting the brakes on fitting the noise. Hard and soft constraints. Augmented error and weight decay.
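For linear regression, the weight-decay form of the augmented error has a closed-form minimizer, w_reg = (XᵀX + λI)⁻¹Xᵀy. A small sketch (the synthetic data and λ value are assumptions for illustration):

```python
import numpy as np

# Sketch: weight decay (ridge) for linear regression,
# w_reg = (X^T X + lambda I)^{-1} X^T y.
# Synthetic data and lambda are illustrative only.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=20)

lam = 1.0  # regularization strength (assumed)
w_reg = np.linalg.solve(X.T @ X + lam * np.eye(3), X.T @ y)  # weight decay
w_ols = np.linalg.solve(X.T @ X, X.T @ y)                    # no regularization
```

The soft constraint shrinks the solution: the regularized weights always have a smaller norm than the unregularized least-squares weights, which is the "brakes on fitting the noise" the lecture describes.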

Lecture 13 – Validation

Validation – Taking a peek out of sample. Model selection and data contamination. Cross validation.
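One common way to "take a peek out of sample" is k-fold cross validation for model selection. A minimal sketch of choosing a regularization parameter by 5-fold CV (the dataset and candidate λ values are assumptions for illustration):

```python
import numpy as np

# Sketch: 5-fold cross validation to select a ridge parameter lambda.
# Dataset and candidate lambdas are made up for illustration.
rng = np.random.default_rng(1)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -1.0, 2.0]) + 0.5 * rng.normal(size=50)

def cv_error(X, y, lam, k=5):
    """Average validation MSE over k folds for ridge regression."""
    idx = np.arange(len(y))
    errs = []
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)  # all points not in this fold
        w = np.linalg.solve(X[train].T @ X[train] + lam * np.eye(X.shape[1]),
                            X[train].T @ y[train])
        errs.append(np.mean((X[fold] @ w - y[fold]) ** 2))
    return np.mean(errs)

lambdas = [0.01, 0.1, 1.0, 10.0]
best = min(lambdas, key=lambda lam: cv_error(X, y, lam))
```

Each candidate model is scored only on data held out of its training fold, so the selection is based on an estimate of out-of-sample error rather than the in-sample fit.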

Lecture 14 – Support Vector Machines

Support Vector Machines – One of the most successful learning algorithms; getting a complex model at the price of a simple one.

Lecture 15 – Kernel Methods

Kernel Methods – Extending SVM to infinite-dimensional spaces using the kernel trick, and to non-separable data using soft margins.

Lecture 16 – Radial Basis Functions

Radial Basis Functions – An important learning model that connects several machine learning models and techniques.

Lecture 17 – Three Learning Principles

Three Learning Principles – Major pitfalls for machine learning practitioners; Occam’s razor, sampling bias, and data snooping.

Lecture 18 – Epilogue

Epilogue – The map of machine learning. Brief views of Bayesian learning and aggregation methods.


Produced in association with Caltech Academic Media Technologies under the Attribution-NonCommercial-NoDerivs Creative Commons License (CC BY-NC-ND). To learn more about this license,…

This lecture was recorded on April 3, 2012, in Hameetman Auditorium at Caltech, Pasadena, CA, USA.

View course materials in iTunes U Course App –… and on the course website –
