
CS909 Data Mining


Welcome to CS909/CS429 Data Mining in Term 2 of 2021! Please follow this website for updates and module announcements. We will also use the mailing list on Tabula to contact you, so make sure you are registered for the module.

Covid-19 Updates:

This page will be updated as soon as information is received from the university or department. Please take care of yourself! We recommend keeping up to date through channels such as the student newsletter.

General Information

Instructor: Fayyaz Minhas

Teaching Fellow: Greg Watson [Primary point of contact for logistics, scheduling, or related issues]

Teaching Assistants:

Ali Mohammadi-Shanghooshabad
John Pocock
Meghdad Kurmanji
Srijay Deshpande
Zheng Fang


Lab Sessions - Weeks 1,3,5,7,9

Each student has been allocated a lab session. Please check your scheduled lab session on Tabula and attend that.

NOTE: The following information may change if we are able to return to face-to-face teaching.

Lab Sessions - Weeks 2,4,6,8,10

NOTE: The following information may change if we are able to return to face-to-face teaching.

Lab Instructions

Online attendance at these lab sessions is compulsory. Their purpose is to help you with the assessed assignments and to provide oral feedback. Please attend the whole of your assigned lab session so that you are not marked absent. You can ask the TAs questions in the Teams meeting by typing "I need help" or by tagging the relevant TA. A TA will then provide one-to-one help in a direct Teams session.

Please keep yourself muted and your video off during Teams meetings and lab sessions, and do not interfere with the setup or functioning of the lab meetings in any way.

Moodle Link (for QA & Discussions)

We shall be using THIS Moodle page for our module.

Ahead of each synchronous session, I will create a post on Moodle inviting your questions about the recorded lectures. Please post your questions by the end of Thursday so that I can address them in the synchronous session.

I will also post a series of questions on Moodle each week that you can answer for self-assessment and feedback.

Please use Moodle for non-urgent communication. We will not monitor or respond to messages in the lab or synchronous lecture Teams sessions after those sessions have concluded.


  • Assignment-1 (25% of final grade): See this link. Due date: 17 Feb. 2021 by 12 noon UK time
  • Assignment-2 (worth 25% of the final mark for MEng and 35% of the final mark for MSc): to be announced by the middle of the module and due by the end of the module.
  • There will be a final exam in the module.


    [PML] Probabilistic Machine Learning: An Introduction by Kevin Patrick Murphy. MIT Press, 2021.

    [IML] Introduction to Machine Learning, 3rd edition, by Ethem Alpaydin (selected chapters: ch. 1, 2, 6, 7, 9, 10, 11, 12, 13)

    [DBB] Deep Learning by Ian Goodfellow, Yoshua Bengio, and Aaron Courville (ch. 6, 7, 8, 9; ch. 1-5 if needed as basics)

    [FNN] Fundamentals of Neural Networks: Architectures, Algorithms and Applications by Laurene Fausett (ch. 2, 6)

    Course Materials

    Slides and reading materials will be posted on or before Wednesday each week. Lab session materials will be available before the start of each lab session.

    Course Stream Channel






    Jan 11

    Introduction (stream, YT)

    Why Data Science? (stream)

    Applications (stream)

    Research Applications (stream)



    Synchronous Session Recording

    Questions by Students
    Follow-up Questions

    Introduction Slides

    Applications and Framework Slides

    k-Nearest Neighbor Algorithm [Required]

    [PML] Chapter-1, [IML] Chapter-1

    CRISPR Talk

    Whole Slide Images are Graphs Talk

    Project Suggestions

    The master algorithm (casual reading)

    A few useful things to know about machine learning

    Learning Python


    Jan 18

    Synchronous Session Recording

    Classification and Linear Discriminants

    Determining Linear Separability

    Prelim: Gradients and Gradient descent

    Prelim: Gradient Descent Code

    Prelim: Convexity

    Perceptron Modeling

    Perceptron Code

    Linear Discriminants (notes)

    Preliminaries (notes)

    Building Linear Models (notes)

    Gradient Descent Code (py)

    Perceptron Code (py)

    Perceptron Algorithm

    Self-Assessment Exercise Questions

    Post your Questions

    Implementing kNN classifier


    Jan 25

    3.1 What's special in an SVM?

    3.2 SVM Formulation

    3.3 A brief history of SVMs

    3.4 Coding an SVM and C

    3.5 Margin and Regularization

    3.6 Linear Discriminants and Selenite Crystals

    3.7 Selenite Crystals bend Space

    3.8 Using transformations to fold

    3.9 Transformations change distance and dot products

    3.10 Kernelized SVMs

    Self Assessment Exercise

    Post your Questions

    SVM Notes

    SVM Applet

    Regularized Perceptron

    Transformations code

    Fold and Cut Theorem

    Book Reading [SVM in PML, SVM in IML]

    SVM Tutorial

    Gradient Descent and Perceptron

    (see lectures from previous week)


    Assignment-1 Announced


    Feb 1



    Work on Assignment-1

    Learning Python

    The following resources may be useful when familiarising yourself with Python.

    Python Documentation:

    NumPy Website:

    Matplotlib Website:

    Video Tutorials

    1. Introduction
    2. Basic Variables
    3. Lists
    4. Control Flow
    5. Functions
    6. Tuples
    7. Sets
    8. Dictionaries
    9. NumPy
    10. Matplotlib
    11. Classes