CRiSM Seminar - Clifford Lam (LSE), Zoltán Szabó (UCL)

Location: D1.07 (Complexity)

Zoltán Szabó (UCL)

Regression on Probability Measures: A Simple and Consistent Algorithm

We address the distribution regression problem: we regress from probability measures to Hilbert-space-valued outputs, where only samples are available from the input distributions. Many important statistical and machine learning problems can be phrased within this framework, including point estimation tasks without analytical solution and multi-instance learning. However, due to the two-stage sampled nature of the problem, the theoretical analysis becomes quite challenging: to the best of our knowledge, the only existing method with performance guarantees requires density estimation (which often performs poorly in practice) and requires the distributions to be defined on a compact Euclidean domain. We present a simple, analytically tractable alternative for solving the distribution regression problem: we embed the distributions into a reproducing kernel Hilbert space and perform ridge regression from the embedded distributions to the outputs. We prove that this scheme is consistent under mild conditions (for distributions on separable topological domains endowed with kernels), and construct explicit finite-sample bounds on the excess risk as a function of the sample numbers and the problem difficulty, which hold with high probability. Specifically, we establish the consistency of set kernels in regression, which was a 15-year-old open question, and also present new kernels on embedded distributions. The practical efficiency of the studied technique is illustrated in supervised entropy learning and aerosol prediction using multispectral satellite images. [Joint work with Bharath Sriperumbudur, Barnabas Poczos and Arthur Gretton.]
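
As a rough illustration of the two-step scheme described in the abstract (embed each distribution into a reproducing kernel Hilbert space, then perform ridge regression on the embeddings), the Python sketch below represents each input distribution by a bag of samples and uses the inner product of empirical mean embeddings, which gives a set kernel between bags. The function names, the Gaussian kernel, the bandwidth, the scaling of the regularization term, and the toy entropy data are all illustrative assumptions and are not taken from the talk.

```python
import numpy as np

def gaussian_gram(A, B, bandwidth):
    """Gaussian kernel matrix between the rows of A (n x d) and B (m x d)."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def set_kernel(bags_a, bags_b, bandwidth):
    """K[i, j] = inner product of the empirical mean embeddings of two bags,
    i.e. the average of k(x, x') over all pairs of samples (a set kernel)."""
    K = np.empty((len(bags_a), len(bags_b)))
    for i, A in enumerate(bags_a):
        for j, B in enumerate(bags_b):
            K[i, j] = gaussian_gram(A, B, bandwidth).mean()
    return K

def fit_ridge(bags, y, bandwidth, lam):
    """Kernel ridge regression on the embedded bags.
    The l * lam scaling of the ridge term is an assumed convention."""
    K = set_kernel(bags, bags, bandwidth)
    l = len(bags)
    return np.linalg.solve(K + l * lam * np.eye(l), y)

def predict(train_bags, alpha, test_bags, bandwidth):
    """Predicted outputs for new bags of samples."""
    return set_kernel(test_bags, train_bags, bandwidth) @ alpha

# Toy data (assumed): each bag holds samples from N(0, sigma^2),
# and the target is the differential entropy of that Gaussian.
rng = np.random.default_rng(0)
sigmas = rng.uniform(0.5, 2.0, size=30)
train_bags = [rng.normal(0.0, s, size=(50, 1)) for s in sigmas]
y = 0.5 * np.log(2.0 * np.pi * np.e * sigmas**2)

alpha = fit_ridge(train_bags, y, bandwidth=1.0, lam=1e-3)

test_sigmas = np.array([0.7, 1.0, 1.5])
test_bags = [rng.normal(0.0, s, size=(50, 1)) for s in test_sigmas]
print("predicted:", predict(train_bags, alpha, test_bags, bandwidth=1.0))
print("true     :", 0.5 * np.log(2.0 * np.pi * np.e * test_sigmas**2))
```

Only samples from each distribution enter the computation, which mirrors the two-stage sampled setting; in practice the bandwidth and regularization parameter would be chosen by validation rather than fixed as here.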


Clifford Lam (LSE)

Nonparametric Eigenvalue-Regularized Precision or COvariance Matrix Estimator for Low and High Frequency Data Analysis

We introduce nonparametric regularization of the eigenvalues of a sample covariance matrix through splitting of the data (NERCOME), and prove that NERCOME enjoys asymptotically optimal nonlinear shrinkage of eigenvalues with respect to the Frobenius norm. One advantage of NERCOME is its computational speed when the dimension is not too large. We prove that NERCOME is positive definite almost surely, as long as the true covariance matrix is so, even when the dimension is larger than the sample size. With respect to the inverse Stein’s loss function, the inverse of our estimator is asymptotically the optimal precision matrix estimator. Asymptotic efficiency loss is defined through comparison with an ideal estimator, which assumes knowledge of the true covariance matrix. We show that the asymptotic efficiency loss of NERCOME is almost surely 0 with a suitable split location of the data. We also show that all the aforementioned optimality results hold for data with a factor structure. Our method avoids the need to first estimate any unknowns from a factor model, and directly gives the covariance or precision matrix estimator. An extension to estimating the integrated volatility matrix for high-frequency data is presented as well. Real data analysis and simulation experiments on portfolio allocation are presented for both low- and high-frequency data.
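
The data-splitting idea can be sketched in Python as follows. The abstract does not state the estimator's formula, so the form used here is an assumption: eigenvectors P1 are taken from the sample covariance of the first part of the data, the regularized eigenvalues are the diagonal of P1' S2 P1 where S2 is the sample covariance of the second part, and the result is averaged over several random permutations of the observations. The function names, the mean-zero assumption, the split location `m`, and the number of permutations are likewise hypothetical choices for illustration.

```python
import numpy as np

def split_estimator(X, m):
    """One data split. X is n x p with rows as observations (assumed mean zero).
    Eigenvectors come from the first m rows, eigenvalues from the remainder."""
    X1, X2 = X[:m], X[m:]
    S1 = X1.T @ X1 / X1.shape[0]   # sample covariance of the first part
    S2 = X2.T @ X2 / X2.shape[0]   # sample covariance of the second part
    _, P1 = np.linalg.eigh(S1)     # columns of P1 are orthonormal eigenvectors
    # d[i] = p_i' S2 p_i, i.e. the diagonal of P1' S2 P1
    d = np.sum(P1 * (S2 @ P1), axis=0)
    # Reassemble: sum_i d[i] * p_i p_i'
    return (P1 * d) @ P1.T

def nercome_sketch(X, m, n_perm=10, seed=0):
    """Average the split estimator over random permutations of the rows
    (the averaging step is an assumption of this sketch)."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    est = np.zeros((p, p))
    for _ in range(n_perm):
        est += split_estimator(X[rng.permutation(n)], m)
    return est / n_perm

# Toy example with dimension larger than sample size (p = 20, n = 15).
rng = np.random.default_rng(42)
X = rng.standard_normal((15, 20))
S_hat = nercome_sketch(X, m=10)

sample_cov = X.T @ X / X.shape[0]
print("rank of sample covariance:", np.linalg.matrix_rank(sample_cov))
print("smallest eigenvalue of sketch estimator:", np.linalg.eigvalsh(S_hat).min())
```

In this toy setting the ordinary sample covariance is singular because the dimension exceeds the sample size, while each regularized eigenvalue is a quadratic form in the second part of the data and is therefore positive for continuous data, which is consistent with the positive definiteness claim in the abstract. A precision matrix estimate would then be obtained by inverting `S_hat`.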

