2020/2021 (Terms 1 and 2): The colloquium seminars will take place on Microsoft Teams.
- If you have ideas for possible speakers, then please email one of the organisers above.
2020/21 Term 1:
3pm-4pm on Wednesday 9th December
Title: On Spectral Graph Clustering
Abstract: I will present our 2019 "Two Truths" PNAS paper. Clustering is a many-splendored thing. As the ill-defined cousin of classification, in which the observation to be classified X comes with a true but unobserved class label Y, clustering is concerned with coherently grouping observations without any explicit concept of true groupings. Spectral graph clustering – clustering the vertices of a graph based on their spectral embedding – is all the rage, and recent theoretical results provide new understanding of the problem and solutions. In particular, we reset the field of spectral graph clustering, demonstrating that spectral graph clustering should not be thought of as k-means clustering composed with Laplacian spectral embedding, but rather Gaussian mixture model (GMM) clustering composed with either Laplacian or Adjacency spectral embedding (LSE or ASE); in the context of the stochastic blockmodel (SBM), we use eigenvector CLTs & Chernoff analysis to show that (1) GMM dominates k-means and (2) neither LSE nor ASE dominates, and we present an LSE vs ASE characterization in terms of affinity vs core-periphery SBMs. Along the way, we describe our recent asymptotic efficiency results, as well as an interesting twist on the eigenvector CLT when the block connectivity probability matrix is not positive semidefinite. (And, time permitting, we will touch on essential results using the matrix two-to-infinity norm.) We conclude with a ‘Two Truths’ LSE vs ASE spectral graph clustering result – necessarily including model selection for both embedding dimension & number of clusters – convincingly illustrated via an exciting new diffusion MRI connectome data set: different embedding methods yield different clustering results, with one (ASE) capturing gray matter/white matter separation and the other (LSE) capturing left hemisphere/right hemisphere characterization.
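For readers who want a concrete picture of the pipeline named in the abstract, here is a minimal sketch of GMM clustering composed with ASE or LSE on a simulated two-block SBM. It is an illustration only, not the speaker's code. The block probabilities, graph size, embedding dimension, and all function names (`sample_sbm`, `ase`, `lse`, `gmm_cluster`, `accuracy`) are assumptions chosen for the example. The Laplacian is taken to be the normalized form D^{-1/2} A D^{-1/2}, which is also an assumption here. Model selection for the embedding dimension and the number of clusters is skipped: both are fixed at 2.

```python
import numpy as np
from sklearn.mixture import GaussianMixture


def sample_sbm(block_sizes, B, rng):
    """Sample a symmetric, loop-free adjacency matrix from a stochastic blockmodel."""
    labels = np.repeat(np.arange(len(block_sizes)), block_sizes)
    P = B[np.ix_(labels, labels)]                  # edge probability for every vertex pair
    upper = np.triu(rng.random(P.shape) < P, k=1)  # sample the strict upper triangle only
    A = (upper | upper.T).astype(float)            # mirror it to get an undirected graph
    return A, labels


def spectral_embed(M, d):
    """Top-d eigenvectors (largest |eigenvalue|), each scaled by sqrt(|eigenvalue|)."""
    vals, vecs = np.linalg.eigh(M)
    top = np.argsort(np.abs(vals))[::-1][:d]
    return vecs[:, top] * np.sqrt(np.abs(vals[top]))


def ase(A, d):
    """Adjacency spectral embedding."""
    return spectral_embed(A, d)


def lse(A, d):
    """Laplacian spectral embedding, assuming L = D^{-1/2} A D^{-1/2}."""
    deg = A.sum(axis=1)
    inv_sqrt = np.where(deg > 0, 1.0 / np.sqrt(np.maximum(deg, 1e-12)), 0.0)
    L = inv_sqrt[:, None] * A * inv_sqrt[None, :]
    return spectral_embed(L, d)


def gmm_cluster(X, k, seed=0):
    """Cluster embedded vertices with a full-covariance Gaussian mixture model."""
    gmm = GaussianMixture(n_components=k, n_init=5, random_state=seed)
    return gmm.fit_predict(X)


def accuracy(pred, truth):
    """Agreement with the true blocks, up to swapping the two cluster labels."""
    agree = np.mean(pred == truth)
    return max(agree, 1.0 - agree)


rng = np.random.default_rng(0)
B = np.array([[0.5, 0.1],
              [0.1, 0.4]])   # hypothetical affinity-style block probabilities
A, labels = sample_sbm([150, 150], B, rng)

pred_ase = gmm_cluster(ase(A, 2), 2)
pred_lse = gmm_cluster(lse(A, 2), 2)

print("ASE + GMM accuracy:", accuracy(pred_ase, labels))
print("LSE + GMM accuracy:", accuracy(pred_lse, labels))
```

With blocks this well separated, both embeddings should recover the planted partition almost perfectly, so the sketch does not reproduce the LSE vs ASE differences that the talk is about. Those appear on harder models, such as core-periphery SBMs, and on real connectome data.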