Machine Learning Course SS 13 U Stuttgart

See my general teaching page for previous versions of this lecture.


Exploiting large-scale data is a central challenge of our time. Machine Learning is the core discipline for addressing this challenge, aiming to extract useful models and structure from data. Studying Machine Learning is motivated in multiple ways: 1) as the basis of commercial data mining (Google, Amazon, Picasa, etc.), 2) as a core methodological tool for data analysis in all sciences (vision, linguistics, software engineering, but also biology, physics, neuroscience, etc.), and 3) as a core foundation of autonomous intelligent systems.


This lecture introduces modern methods in Machine Learning, including discriminative as well as probabilistic generative models. A preliminary outline of topics is:

  • motivation
  • probabilistic modeling and inference
  • regression and classification methods (kernel methods, Gaussian Processes, Bayesian kernel logistic regression, and the relations between them)
  • discriminative learning (logistic regression, Conditional Random Fields)
  • feature selection
  • boosting and ensemble learning
  • representation learning and embedding (kernel PCA and derivatives, deep learning)
  • graphical models
  • inference in graphical models (MCMC, message passing, variational)
  • learning in graphical models
Students should have basic knowledge of linear algebra, probability theory, and optimization.
Organization
  • This is the central website of the lecture. Links to slides, exercise sheets, announcements, etc. will all be posted here.
  • See the 01-introduction slides for further information.
Schedule, slides & exercises
date | topics | slides | exercises (due on 'date'+1)
08.04. | Introduction & Organization | 01-introduction (notation) | -
15.04. | Regression: linear regression, non-linear features (polynomial, RBFs, piece-wise), regularization, cross validation, Ridge/Lasso, kernel trick (see the short code sketch below the schedule) | 02-regression | e01-intro
22.04. | Classification: discriminative function, logistic regression, binary & multi-class case, conditional random fields | 03-classification | e02-linearRegression (data: ../data/dataLinReg1D.txt, ../data/dataLinReg2D.txt, ../data/dataQuadReg1D.txt, ../data/dataQuadReg2D.txt, ../data/dataQuadReg2D_noisy.txt)
29.04. | Classification (cont.) | - | e03-classification (data: ../data/data2Class.txt, ../data/digit_pca.txt, ../data/digit_label.txt)
13.05. | Breadth of ML ideas | 04-ideas | e04-PCA-PLS (data: ../data/yalefaces_cropBackground.tgz, ../data/yalefaces.tgz)
27.05. | Breadth of ML ideas (cont.) | - | e05-WEKA-boosting (data: ../data/digits-train.arff.gz, ../data/digits-test.arff.gz)
03.06. | SVMs (by Vien Ngo) | 05-vien-SVM | e06-SVM-NN
10.06. | Deep Learning; Probability basics | 04-ideas, 06-BayesBasics | e07-BayesBasics
17.06. | Bayesian Regression & Classification | 07-BayesianRegressionClassification | e08-GaussianProcesses
24.06. | Graphical Models | 08-graphicalModels | e09-graphicalModels
01.07. | Inference in Graphical Models | 09-graphicalModels-Inference | e10-inference
08.07. | Learning with Graphical Models | 10-graphicalModels-Learning | e11-EM (data: ../data/gauss.txt, ../data/mixture.txt)
15.07. | Summary | 13-MachineLearning-script | -
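As referenced in the 15.04. row, here is a minimal sketch of the kind of pipeline that lecture covers: ridge regression with polynomial features and a cross-validated choice of the regularization strength. It uses synthetic 1D data and scikit-learn rather than the exercise data files, and the degree and alpha grid are arbitrary; this is an illustration, not the course's reference solution.

    # Sketch: ridge regression with polynomial features; regularization strength
    # chosen by cross validation. Synthetic 1D data stands in for the exercise
    # data files; hyperparameter choices are arbitrary and for illustration only.
    import numpy as np
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import PolynomialFeatures
    from sklearn.linear_model import Ridge
    from sklearn.model_selection import GridSearchCV

    rng = np.random.default_rng(0)
    x = rng.uniform(-1, 1, size=100)
    y = 2.0 * x**2 - x + 0.1 * rng.normal(size=100)   # noisy quadratic target
    X = x[:, None]                                    # (n_samples, 1) design input

    model = make_pipeline(PolynomialFeatures(degree=3), Ridge())
    search = GridSearchCV(model, {"ridge__alpha": [1e-3, 1e-2, 1e-1, 1.0, 10.0]}, cv=5)
    search.fit(X, y)
    print("best alpha:", search.best_params_["ridge__alpha"],
          " mean CV score (R^2):", round(search.best_score_, 3))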
Literature
[1] The Elements of Statistical Learning: Data Mining, Inference, and Prediction by Trevor Hastie, Robert Tibshirani and Jerome Friedman. Springer, Second Edition, 2009.
full online version available
(recommended: read introductory chapter)

[2] Pattern Recognition and Machine Learning by Christopher M. Bishop. Springer, 2006.
online
(especially chapter 8, which is fully online)

[email by Stefan Otte:] This is a nice little (26-page) linear algebra and matrix calculus reference. It is used for the ML class at Stanford; it may be of interest for this ML class. link

[email by Stefan Otte:] Feature selection, L1 vs. L2 regularization, and rotational invariance. Paper: link. Comments: link
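As a small illustration of the L1-vs-L2 point (not taken from the paper): the sketch below compares Lasso (L1) and Ridge (L2) on synthetic data in which only a few of many features are relevant; the data, dimensions, and regularization strengths are arbitrary assumptions.

    # Sketch: L1 (Lasso) zeroes out irrelevant features, L2 (Ridge) only shrinks them.
    # Synthetic data and hyperparameters are arbitrary, chosen for illustration only.
    import numpy as np
    from sklearn.linear_model import Lasso, Ridge

    rng = np.random.default_rng(0)
    n, d, d_relevant = 200, 50, 5            # many features, few actually relevant
    X = rng.normal(size=(n, d))
    w_true = np.zeros(d)
    w_true[:d_relevant] = rng.normal(size=d_relevant)
    y = X @ w_true + 0.1 * rng.normal(size=n)

    lasso = Lasso(alpha=0.1).fit(X, y)       # L1-regularized least squares
    ridge = Ridge(alpha=0.1).fit(X, y)       # L2-regularized least squares

    print("non-zero coefficients  Lasso:", int(np.sum(np.abs(lasso.coef_) > 1e-6)),
          " Ridge:", int(np.sum(np.abs(ridge.coef_) > 1e-6)))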

[email by Stefan Otte:]
I recently watched a very good Google Tech Talk on ensembles. In the talk "The Counter-Intuitive Properties of Ensembles for Machine Learning, or, Democracy Defeats Meritocracy", W. Philip Kegelmeyer argues (roughly speaking) that one should use ensembles for supervised learning. Perhaps this is of interest to some students.

Here are a few of my notes:
- Boosting: overfitting, sensitive to outliers.
- "Ensembles of experts": diversity of experts --> diversity in errors --> robustness/no overfitting
- "Out-of-bag validation" (OOB) to determine the ensemble size (vs. learning the weights for the voting, which does not scale)
- unstable classifiers (e.g. decision trees) are a good fit for ensembles
- decision trees without pruning work well in ensembles (pruning is normally expensive!)
- "Ensembles of bozos": LOTS of bozos, each trained on a tiny subset (1%) of the data
- traditional < experts < bozos
- training bozos is faster than training one traditional sage!


http://csmr.ca.sandia.gov/~wpk/
http://csmr.ca.sandia.gov/~wpk/slides/avatar-ensembles.pdf
http://csmr.ca.sandia.gov/~wpk/avi/avatar-tools-background-video.avi


Best regards,
 Stefan
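To make the out-of-bag idea from the notes above concrete, here is a minimal sketch of an ensemble of unpruned decision trees with OOB validation, using scikit-learn's RandomForestClassifier; the digits dataset and the ensemble sizes are placeholders and not taken from the talk.

    # Sketch: ensemble of unpruned decision trees (bagging-style), with out-of-bag
    # (OOB) accuracy used to judge how many ensemble members are enough.
    # Dataset and ensemble sizes are placeholders, not taken from the talk.
    from sklearn.datasets import load_digits
    from sklearn.ensemble import RandomForestClassifier

    X, y = load_digits(return_X_y=True)

    for n_trees in [25, 100, 400]:
        forest = RandomForestClassifier(
            n_estimators=n_trees,   # ensemble size
            max_depth=None,         # unpruned trees: cheap, unstable base learners
            oob_score=True,         # validate each tree on the samples it did not see
            random_state=0,
        ).fit(X, y)
        print(f"{n_trees:4d} trees  OOB accuracy = {forest.oob_score_:.3f}")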
