Machine Learning is the science of how we can build abstractions of the world from data. In this unit we will start with the fundamental underlying principles and philosophies that allows us to learn and then look at how we can formulate these using explicit models. We will then look at how we can fit these models to data in order to "learn". The unit is not aimed at giving you a set of tools that you can use to do machine learning but rather give you an understanding so that you can yourself design, choose and understand the right tools to solve real problems.
Machine learning is mathematical in nature and a good grasp of linear algebra and multi-variate calculus is recommended in order to fully digest the material. There is lots of good material on the internets that you can brush up your skill with. Particularly good is the upcoming (free) book Mathematics for Machine Learning. If you feel that it was a long time since you did maths or feel a bit rusty, I recommend that you have a look through this book. A lot of the work we do is with matrices, a very good resource for matrix algebra is the The Matrix Cookbook. It doesn't really provide any understanding but if you need a resource to quickly look-up the derivative this is the place to start.
Material
You can download all the material for the course by checking out the following GitHub repo. Keep up to sync with this repository to access the lectures/coursework and additional material. I am unlikely to keep the links on this website up to date so its the material in the repository that counts. The blackboard site is not used for anything other than to access the video recordings of the lectures. If you want to plan ahead you can look at previous years lecture material which you can access by changing branch to 2017
and 2018
respectively.
There is a reddit feed associated with the unit. The idea is for this forum to be the main form of interaction outside of lectures. Due to the size of this unit I am not able to interact over email so please direct your questions to the reddit forum rather than mail. This is a good place to discuss with each other, provide comments to me and importantly it is completely anonymous. Please try to answer each others questions and discuss with each other. If you have comments on the material, suggestions etc. the are very appreciated. Don't worry about expressing your thoughts even if they are critical, it only shows that you care and want the unit to be better.
Ethnography study
According to Wikipedia Ethnography is defined as "Ethnography is the systematic study of people and cultures. It is designed to explore cultural phenomena where the researcher observes society from the point of view of the subject of the study."
Machine learning have and is expected to have an ever greater impact on society, but this interaction goes both ways, society also effects machine learning as a topic. I am very excited to have ethnography PhD student Kate Byron as part of the team this year. Kate is doing research on how social and cultural norms shape machine learning practices. During the unit Kate will run a focus group to discuss the topic and conduct interviews regarding these questions. I think this is a great chance for anyone who is interested not only in the technical details but the "bigger picture" of the topic in society. If you are interested in getting involved please contact Kate Byron. Kate will also give a lecture towards the end of the unit on the ethical implications of machine learning.
Lectures
We have two lectures of 1h each week, Monday 12-13 and Tuesday 17-18 both in the Queens Building lecture theatre 1.40. In total there is 10 weeks of taught material with a break for a reading week during week 7 and new for this year we do not have any teaching week 12 and term finishes week 11.
- Introduction
- Probabilities
- Distributions
- Derivation of Gaussian Identities
- Linear Regression
- Dual Linear Regression
- Gaussian Processes
- Unsupervised Learning
- Bayesian Optimisation
- Evidence and Graphical Models
- Dirichlet Processes
- Topic Models
- Neural Networks
- Reinforcement Learning
- Laplace Approximation
- Sampling
- Variational Inference
- Derivation of Variational Bound Potts Model
- Ethical Dilemmas and Machine Learning
- Summary Lecture
Last year I wrote a summary document of the lectures that can be found here this document is summarising the key points of each lecture and is a good starting point for testing your more abstract understanding of each of the topics. This document will be update during the unit so make sure that you are in sync with the repository. If you want to prepare for the lectures having a read of this document is a good idea.
Practical
Each week there is a 2h lab session where we will go implement the methods that we talk about during the lectures. Importantly, the worksheets are there for you to digest the material better and are just a suggestion you are the ones that are responsible for finding the best way of learning the material. The language of choice for machine learning is currently Python, therefore during the first week we will do a quick introduction Python and the main libraries of code that we will use.
- Introduction to Python/Numpy
- Distributions and Probabilities
- Linear Regression
- Gaussian Processes
- Unsupervised Learning
- Evidence
- Bayesian Optimisation
- Sampling
- Variational Inference
- Tensorflow
The labs are not assessed your grade will be determined by an exam in the end of the unit. Importantly the labs are designed to be the material that you need to do in order to understand prepare yourself for the exam.
Material
We will use Bishop, C. M., Pattern recognition and machine learning (2006) for most parts of this unit. The book is an excellent introduction to machine learning and one of the few that provides a consistent and rigorous narrative rather than falling into the trap and becoming a cookbook for practitioners. The book is freely availible here. As the book is rather old we will for a few of the topics use other material that is freely available.