Computer Science Colloquium
Organiser: Tom Gur
Colloquia alternate between online and face-to-face formats. Additional titles and abstracts will be added as details become available.
Current term
Third term, 2021-2022:
26 May 2022, 16:00 - 17:00 |
Title: Reproducibility in Learning Abstract: Reproducibility is vital to ensuring scientific conclusions are reliable, but failures of reproducibility have been a major issue in nearly all scientific areas of study in recent decades. A key issue underlying the reproducibility crisis is the explosion of methods for data generation, screening, testing, and analysis, where only the combinations producing the most significant results are reported. Such practices can lead to erroneous findings that appear significant but do not hold up when other researchers attempt to replicate them. In this talk, we introduce a new notion of reproducibility for randomized algorithms. This notion ensures that, with high probability, an algorithm returns exactly the same output when run on two samples from the same distribution. Despite this exceedingly strong requirement, efficient reproducible algorithms exist for several fundamental problems in statistics and learning. We will introduce these problems and the techniques used to design reproducible algorithms for them. Finally, we discuss connections to other well-studied notions of algorithmic stability, such as differential privacy. This talk is based on joint work with Russell Impagliazzo (UCSD), Rex Lei (UCSD), and Toniann Pitassi (Columbia University), to appear in STOC 2022. |
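For intuition only, here is a minimal sketch of the flavour of this definition (not the construction from the paper): a mean estimator made reproducible by snapping the empirical mean onto a randomly shifted grid whose offset comes from randomness shared across runs. The grid width and sample sizes are illustrative assumptions.

```python
import random

def reproducible_mean(samples, shared_seed, grid_width=0.1):
    """Toy reproducible mean estimator (illustration only).

    Two runs on fresh samples from the same distribution return the
    exact same output with high probability, because both runs round
    their empirical means onto the same randomly shifted grid.
    """
    rng = random.Random(shared_seed)        # shared internal randomness
    offset = rng.uniform(0, grid_width)     # random shift of the grid
    empirical_mean = sum(samples) / len(samples)
    steps = round((empirical_mean - offset) / grid_width)
    return offset + steps * grid_width

# Two independent runs typically agree exactly, as long as the two
# empirical means differ by much less than grid_width.
run1 = reproducible_mean([random.gauss(0, 1) for _ in range(10_000)], shared_seed=42)
run2 = reproducible_mean([random.gauss(0, 1) for _ in range(10_000)], shared_seed=42)
print(run1 == run2)   # True with high probability
```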
9 June 2022, 14:00 - 15:00 |
Title: Extending Generalization Theory Towards Addressing Modern Challenges in Machine Learning
Abstract: Recent years have witnessed tremendous progress in the field of Machine Learning (ML). However, many of the recent breakthroughs demonstrate phenomena that lack explanations, and sometimes even contradict conventional wisdom. One main reason for this is that classical ML theory adopts a worst-case perspective which seems too pessimistic to explain practical ML: in reality data is rarely worst-case, and experiments indicate that often much less data is needed than predicted by traditional theory. In this talk we will discuss two variations of classical learning theory. These models are based on a distribution- and data-dependent perspective which complements the distribution-free worst-case perspective of classical theory, and is suitable for exploiting specific properties of a given learning task. A common theme of these models is their combinatorial nature. This can be seen as a continuation of the fruitful link between machine learning and combinatorics, which goes back to the discovery of the VC dimension more than 50 years ago. Based on joint works with Noga Alon, Olivier Bousquet, Steve Hanneke, Ron Holzman, Ramon van Handel, and Amir Yehudayoff. |
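As background only (not a result of the talk), the distribution-free worst-case baseline referred to above is the classical VC sample-complexity bound: for a hypothesis class of VC dimension d, in the realizable PAC setting,

\[
m(\varepsilon, \delta) \;=\; O\!\left(\frac{d \log(1/\varepsilon) + \log(1/\delta)}{\varepsilon}\right)
\]

samples suffice to reach error ε with probability at least 1 − δ (with an extra 1/ε factor in the agnostic case). Distribution- and data-dependent analyses aim to beat this bound on benign, non-worst-case instances.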
TBD |
Gil Cohen (Tel Aviv University) Title and abstract to be announced shortly |
Past terms
Second term, 2021-2022:
13 January 2022, 14:00 - 15:00 |
Title: Complexity of total search problems Abstract: There are many theorems of the form "for all, there exists" (for example, every game has a Nash equilibrium, and every number has a prime factorisation). Bio: Since 2013 I have been a professor at the Dept of Computer Science, University of Oxford. Previously I was a professor at the Dept of Computer Science, University of Liverpool, where I moved in 2006. I was founding head of the ECCO research group. At both Oxford and Liverpool, I’ve worked extensively on algorithmic game theory and related problems in algorithms and computational complexity. Prior to this I was at the Department of Computer Science, University of Warwick, from 1997. I have also worked at Aston University and at Sandia National Labs, USA, on algorithmic problems from computational biology. PhD (Edinburgh, ’92) in computational learning theory; BA in mathematics (Oxford, ’88). |
10 February 2022, 14:00 - 15:00 |
Yves-Alexandre de Montjoye (Imperial) Title: Using Data while Protecting Privacy in the Digital Era Abstract: We live in a time when information about most of our movements and actions is collected and stored in real time. The availability of large-scale behavioural data dramatically increases our capacity to understand and potentially affect the behaviour of individuals and collectives. The use of this data, however, raises legitimate privacy concerns. Anonymization is meant to address these concerns: allowing data to be fully used while preserving individuals' privacy. In this talk, I will first describe a line of research in attacks against de-identified datasets showing how traditional data protection mechanisms mostly fail to protect people's privacy in the age of big data. I will then describe what I see as a necessary evolution of the notion of data anonymization towards an anonymous use of data and discuss the pros and cons of some of the modern privacy engineering techniques currently developed, ranging from Differential Privacy to Query-Based Systems. I will conclude by describing how, when combined, I think they will allow large-scale behavioural data to be used while giving individuals strong privacy guarantees. Bio: Yves-Alexandre de Montjoye is an Associate Professor at Imperial College London. He is currently a Special Adviser on AI and Data Protection to EC Justice Commissioner Reynders and a Parliament-appointed expert to the Belgian Data Protection Agency (APD-GBA). In 2018-2019, he was a Special Adviser to EC Competition Commissioner Vestager, co-authoring the Competition Policy for the Digital Era report. Yves-Alexandre worked for the Boston Consulting Group and acted as an expert for both the Bill and Melinda Gates Foundation and the United Nations. He received his PhD from MIT in 2015. |
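As a minimal, hedged illustration of one building block mentioned above (the Laplace mechanism for differential privacy, not the speaker's own systems), the sketch below releases a counting query with ε-differential privacy; the records, predicate and ε value are hypothetical.

```python
import math
import random

def dp_count(records, predicate, epsilon):
    """Release a counting query under epsilon-differential privacy.

    A counting query has sensitivity 1 (adding or removing one record
    changes the count by at most 1), so Laplace noise of scale
    1/epsilon gives epsilon-DP.
    """
    true_count = sum(1 for r in records if predicate(r))
    # Sample Laplace(0, 1/epsilon) noise by inverse transform sampling.
    u = random.random() - 0.5
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# Hypothetical mobility dataset: how many users visited location "X" five or more times?
records = [{"user": i, "x_visits": random.randint(0, 10)} for i in range(1_000)]
print(dp_count(records, lambda r: r["x_visits"] >= 5, epsilon=0.5))
```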
10 March 2022, 14:00 - 15:00 |
Title: Programming for our Future -- The role of computer science in climate science Abstract: Climate science and computer science are both relatively young fields of study that essentially "grew up" together; some of the first Bio: Dominic Orchard is a researcher in both the theory and practice of |
17 March 2022, 14:00 - 15:00 |
Title: The Limits of Symmetric Computation Abstract: The most famous open problem in theoretical computer science, Bio: Anuj Dawar is Professor of Logic and Algorithms at the University of Cambridge, where he has worked since 1999. He has degrees from IIT, Delhi, the University of Delaware and obtained his PhD at the University of Pennsylvania. Before moving to Cambridge, he worked in Swansea. His research focuses on investigating computational complexity through the lens of logic. He served for five years as president of the European Association for Computer Science Logic. He has chaired (among others) the committees that award the Ackermann award, the Gödel prize, the Barry Cooper prize and the IPEC-Nerode award. |
First term, 2021-2022:
07 October 2021, 14:00 - 15:00 |
Mohamed Abdelfattah (Cornell Tech) Title: Rethinking Deep Learning Computations: From AutoML to Hardware Abstract: The explosion and availability of data has created a new computation paradigm based on deep learning. Consequently, we need to rethink both software and hardware to make these computations possible and to keep up with the ever-increasing computation demand. In this talk, I will describe how we use automated machine learning (AutoML) to enable on-device AI through neural architecture search, compression, hardware modelling and efficient search algorithms. Next, I will give an overview of my experience in designing hardware and compilers for deep learning acceleration – I will focus on a new automated co-design methodology that simultaneously improves efficiency and accuracy. Finally, I will focus on reconfigurable devices like FPGAs. I will describe how an embedded network-on-chip can transform FPGAs into a general-purpose computation platform well-suited for deep learning. Bio: Mohamed is an incoming assistant professor of ECE at Cornell University and Cornell Tech. He is currently a research team lead at the Samsung AI Center in Cambridge, UK, working on the codesign of deep learning algorithms and hardware. Before that, he was at Intel building an FPGA-based accelerator and compiler for deep neural networks. Mohamed did his PhD at the University of Toronto, during which he was awarded the Vanier Canada Graduate Scholarship and three best paper awards for his work on embedded networks-on-chip for FPGAs. |
21 October 2021, 14:00 - 15:00 |
Title: Learning to Schedule Abstract: In this talk, I will present new results for two scheduling problems with unknown parameters that require using a learning algorithm along with a scheduling policy. The first problem is for a multi-class, multi-server queueing system with stochastic rewards of assignments of jobs to servers, with mean rewards depending on the features of jobs and servers according to a bilinear model. The goal is to maximize the expected reward of assignments while keeping the queueing system stable. The second problem is scheduling jobs with stochastic holding costs with unknown mean values in a single server system. The goal in this problem is to minimize the expected total job holding cost. For both these problems, I will present new algorithms and regret bounds. The motivating application scenarios for these problems arise in data processing platforms, crowdsourcing systems, and other online platforms. Joint work with Jung-Hun Kim and Dabeen Lee. Bio: Milan Vojnovic is a Professor of Data Science with the Department of Statistics at LSE. He was a Researcher at Microsoft Research from 2004 to 2016. He received a Ph.D. degree from EPFL in 2003. He was an affiliated lecturer at the University of Cambridge from 2014 to 2016. He held a visiting researcher position with Facebook from 2019 to 2021. Milan’s research interests are in optimisation, machine learning, and design of data-driven algorithms. His research work has been recognised by various awards, including the 2010 ACM Sigmetrics Rising Star Researcher Award, the 2005 ERCIM Cor Baayen Award, and several best paper awards at various conferences. He received a Facebook Faculty Research Award in 2019 and a Criteo Faculty Research Award in 2018. |
04 November 2021, 17:00 - 18:00 |
Divyakant Agrawal (UC Santa Barbara) Title: On the power of summary statistics in large-scale data management Abstract: During the past two decades we have seen an unprecedented increase in the amount of data that is being generated from numerous internet-scale applications. Before this data can be subjected to modelling and analysis, it becomes necessary to obtain summary statistics such as the cardinality of unique visitors, frequency counts of users from different states or countries, and in general the quantile and median information of the dataset. Efficient algorithms exist for computing this information exactly. Unfortunately, these algorithms either require a considerable amount of time, scanning the data multiple times, or require additional storage that is linear in the size of the dataset itself. Approximation methods with guaranteed error bounds, developed in the context of streaming data, are extremely effective for extracting useful and relatively accurate knowledge from big data. In this talk, we will start with a review of one such approximation technique for estimating frequency counts over streaming data where only insert operations are permitted. We will then present a refinement of this technique in the context of large datasets that need to handle both insert and delete operations. |
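As a hedged illustration of the kind of streaming summary discussed (a standard Count-Min sketch, not necessarily the speaker's specific technique), the code below estimates frequency counts in small space; because its counters are linear, the same structure also absorbs deletions in the strict turnstile setting. Width and depth are illustrative parameters.

```python
import hashlib

class CountMinSketch:
    """Approximate frequency counts over a stream in sublinear space.

    In the insert-only case, estimates overshoot the true count by at
    most 2N/width with probability >= 1 - 2**(-depth), where N is the
    total stream length; with deletions, the guarantee carries over as
    long as true counts never go negative (strict turnstile model).
    """
    def __init__(self, width=1024, depth=5):
        self.width, self.depth = width, depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        for row in range(self.depth):
            h = hashlib.blake2b(f"{row}:{item}".encode(), digest_size=8)
            yield row, int.from_bytes(h.digest(), "big") % self.width

    def update(self, item, delta=1):
        # delta = +1 for an insert, -1 for a delete.
        for row, col in self._cells(item):
            self.table[row][col] += delta

    def estimate(self, item):
        return min(self.table[row][col] for row, col in self._cells(item))

cms = CountMinSketch()
for country in ["UK", "US", "UK", "IN", "UK"]:
    cms.update(country)
cms.update("US", -1)        # one deletion
print(cms.estimate("UK"))   # close to the true count of 3
```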
25 November 2021, 14:00 - 15:00 |
Claus Brabrand (ITU Copenhagen) Title: How to Increase the Diversity in Computer Science Abstract: Women are widely underrepresented in Computing studies. |
Third term, 2020-2021:
13 May 2021, 14:00 - 15:00 |
Title: Optimising Computer Systems in Complex and Dynamic Parameter Space Abstract: Performance tuning of computer systems is challenging for a variety of reasons. Modern computer systems expose many configuration parameters in a complex and massive parameter space. The systems are nonlinear, and there is no method for quantifying or modelling such systems to the level of precision required for performance tuning. Auto-tuning approaches have emerged that use a black-box optimiser such as Bayesian Optimisation (BO). However, BO has limited scalability. Reinforcement Learning (RL) can be applied to combinatorial optimisation problems, but there is a gap between current research and practical RL deployments. I will introduce our framework to tackle these issues and demonstrate the potential of machine-learning-based methodologies for computer system optimisation. Bio: Eiko Yoneki is a senior researcher in the Systems Research Group of the University of Cambridge Computer Laboratory and a Turing Fellow at the Alan Turing Institute. Her research interests span distributed systems, networking and databases, including large-scale graph processing. Her group’s current research focusses on auto-tuning of data processing/analytics frameworks to deal with complex parameter spaces using ML. |
27 May 2021, 14:00 - 15:00 |
Title: Constructive Cryptography Abstract: Modularization is a key principle in any constructive discipline. One wants to obtain complex constructions as the composition of simpler Bio: Ueli Maurer is Full Professor of Computer Science at ETH Zurich. His research interests include the theory and applications of cryptography, information security, theoretical computer science, information theory, and discrete mathematics. One of his long-term research goals is to establish a constructive theory of cryptography and to apply it to the modular design of provably-secure cryptographic protocols. He served as Editor-in-Chief of the Journal of Cryptology from 2002 to 2010. He is an IEEE Fellow, an ACM Fellow, an IACR Fellow, and a member of the German Academy of Sciences (Leopoldina). He received the 2013 Vodafone Innovation Award for Mobile Communications, the 2016 RSA Award for Excellence in Mathematics, and the 2016 Theory of Cryptography (TCC) Test-of-Time Award. Maurer has served many companies and start-ups as board member and consultant and has advised many government organisations. In 2018 he co-founded the Concordium project and foundation, aiming to create a global, universally trusted, provably incorruptible and secure transaction platform. |
24 June 2021, 14:00 - 15:00 |
Title: Stop talking to me (again): some communication-avoiding programming patterns Abstract: In this talk, we generalise the term communication-avoiding. Beyond its well-established meaning from linear algebra, we make it comprise (i) the reduction of data volume, (ii) the elimination of (meta) data generation, (iii) the reduction of data exchange frequency, (iv) the homogenisation of data access, (v) data access hiding and (vi) the localisation of data transfers. These criteria apply both to classic data exchange between compute nodes and to data movements on the chip. Communication-avoiding then tackles the problematic divergence sketched above. Bio: Tobias Weinzierl is Professor in the Department of Computer Science at Durham University, Durham, UK. He previously served at the Munich Centre for Advanced Computing (see the Springer edited volume Advanced Computing), and holds a PhD and habilitation from the Technical University of Munich. |
Second term, 2020-2021:
21 January 2021, 14:00 - 15:00 |
Arthur Gervais (Imperial College; Liquidity Network) Title: Flash Loans for Fun and Profit
Abstract: Credit allows a lender to loan out surplus capital to a borrower. In the traditional economy, credit bears the risk that the borrower may default on its debt; the lender hence requires upfront collateral from the borrower, plus interest payments. Due to the atomicity of blockchain transactions, lenders can offer flash loans, i.e., loans that are only valid within one transaction and must be repaid by the end of that transaction. This concept has led to a number of interesting attack possibilities, some of which were exploited in February 2020. In this talk we will explore the implications of transaction atomicity and flash loans for the nascent decentralized finance (DeFi) ecosystem. We will analyze two existing attacks with ROIs beyond 500k% and formulate finding the attack parameters as an optimization problem over the state of the underlying Ethereum blockchain and the state of the DeFi ecosystem. We will show how malicious adversaries can efficiently maximize an attack profit and hence damage the DeFi ecosystem further. Specifically, we will present how two previously executed attacks can be “boosted” to result in a profit of 829.5k USD and 1.1M USD, respectively, which is a boost of 2.37× and 1.73×, respectively. Bio: Arthur Gervais is a Lecturer (equivalent Assistant Professor) at Imperial College London, and the founder of Liquidity Network. He is passionate about information security and has worked since 2012 on blockchain-related topics, with a recent focus on Decentralized Finance (DeFi). |
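As a toy, hedged model of the atomicity property described above (not of any real lending protocol or of the attacks analysed in the talk), the sketch below lends funds for the duration of a single simulated 'transaction' and reverts all state changes unless the loan plus fee is repaid; all names, amounts and the fee rate are hypothetical.

```python
import copy

class FlashLoanPool:
    """Minimal model of a flash-loan pool inside one atomic transaction."""

    def __init__(self, liquidity, fee_rate=0.0009):
        self.liquidity = liquidity
        self.fee_rate = fee_rate

    def flash_loan(self, amount, strategy):
        """Lend `amount` to `strategy`; revert everything if not repaid.

        Mirrors blockchain atomicity: a failed transaction leaves no
        trace, so the lender takes no default risk and needs no collateral.
        """
        snapshot = copy.deepcopy(self.__dict__)   # pool state before the tx
        self.liquidity -= amount
        try:
            repaid = strategy(amount)
            if repaid < amount * (1 + self.fee_rate):
                raise RuntimeError("loan + fee not repaid within the transaction")
            self.liquidity += repaid
        except Exception:
            self.__dict__ = snapshot              # revert: nothing happened
            raise

pool = FlashLoanPool(liquidity=1_000_000)
# Hypothetical strategy that turns the borrowed capital into a 2% profit.
pool.flash_loan(500_000, lambda funds: funds * 1.02)
print(pool.liquidity)   # the pool keeps its principal plus the repayment surplus
```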
4 February 2021, 14:00 - 15:00 |
Title: Local computation algorithms Abstract: Consider a setting in which inputs to and outputs from a computational problem are so large that there is no time to read them in their entirety. However, if one is only interested in small parts of the output at any given time, is it really necessary to solve the entire computational problem? Is it even necessary to view the whole input? We survey recent work in the model of "local computation algorithms" which, for a given input, supports queries by a user to values of specified bits of a legal output. The goal is to design local computation algorithms in such a way that very little of the input needs to be seen in order to determine the value of any single bit of the output. Though this model describes sequential computations, techniques from local distributed algorithms have been extremely important in designing efficient local computation algorithms. In this talk, we describe results on a variety of problems for which sublinear time and space local computation algorithms have been developed -- we will give special focus to finding maximal independent sets and generating random objects. Bio: Ronitt Rubinfeld is the Edwin Sibley Webster Professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology, and is on the faculty at Tel Aviv University. Her research interests include sublinear time algorithms, local algorithms and algorithms for testing discrete distributions over large domains. Ronitt Rubinfeld has been an ONR Young Investigator, a Sloan Fellow, an invited speaker at the International Congress of Mathematicians (2006) and an ACM Fellow. |
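To make the model concrete, here is a hedged sketch, under simplifying assumptions and not taken from the talk, of a local computation algorithm for maximal independent set queries: every vertex receives a pseudorandom rank derived from a shared seed, and a vertex is in the (random-order greedy) MIS iff none of its lower-ranked neighbours is. On bounded-degree graphs each query explores only a small neighbourhood in expectation, so single output bits can be answered without seeing the whole graph.

```python
import hashlib
from functools import lru_cache

def make_mis_oracle(neighbours, shared_seed):
    """Query oracle for one fixed maximal independent set (MIS).

    `neighbours(v)` returns the neighbours of vertex v. All queries are
    answered consistently with the MIS obtained by greedily inserting
    vertices in increasing order of their pseudorandom ranks.
    """
    def rank(v):
        h = hashlib.blake2b(f"{shared_seed}:{v}".encode(), digest_size=8)
        return int.from_bytes(h.digest(), "big")

    @lru_cache(maxsize=None)
    def in_mis(v):
        # v joins the MIS unless some lower-ranked neighbour already did.
        return all(not in_mis(u) for u in neighbours(v) if rank(u) < rank(v))

    return in_mis

# Tiny example: a 5-cycle. Each query touches only a few vertices.
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 0]}
in_mis = make_mis_oracle(lambda v: adj[v], shared_seed=7)
print([v for v in adj if in_mis(v)])   # a maximal independent set of the cycle
```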
First term, 2020-2021:
29 October 2020, 14:00 - 15:00 |
Title: A primer on PAC-Bayesian learning, followed by News from the PAC-Bayes frontline Abstract: PAC-Bayes is a generic and flexible framework for addressing the generalisation abilities of machine learning algorithms. It leverages the power of Bayesian inference and allows one to derive new learning strategies. I will briefly present the key concepts of PAC-Bayes and illustrate a few of its recent successes, for deep neural networks, coherent risk measures, contrastive learning and robust learning. Bio: Benjamin Guedj is a Principal Research Fellow (~Associate Professor) at University College London (Centre for Artificial Intelligence and Department of Computer Science). He is also a tenured research scientist at Inria, the top French research institute in mathematics and computer science, and a visiting researcher with the Alan Turing Institute. Since September 2020, he has been the founder and scientific director of The Inria London Programme, a joint lab between France and the UK. He holds a PhD in mathematics from Sorbonne Université (Paris, France) and focuses on statistical learning theory, PAC-Bayes, computational statistics, and theory for deep learning, among other topics. |
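For orientation, one classical PAC-Bayes bound (a McAllester/Maurer-style statement for losses in [0, 1], given here as background rather than as a result from the talk): for any prior π chosen before seeing the n-point sample S and any δ ∈ (0, 1), with probability at least 1 − δ over S, simultaneously for all posteriors ρ,

\[
\mathbb{E}_{h \sim \rho}\, L(h) \;\le\; \mathbb{E}_{h \sim \rho}\, \hat{L}_S(h)
\;+\; \sqrt{\frac{\mathrm{KL}(\rho \,\|\, \pi) + \ln\frac{2\sqrt{n}}{\delta}}{2n}},
\]

where L is the population risk and \(\hat{L}_S\) the empirical risk; the KL term is what makes the bound algorithm- and data-dependent.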
19 November 2020, 17:00 - 18:00 |
Rocco Servedio (Columbia University) Title: Testing noisy linear functions for sparsity Abstract: This talk explores a question at the intersection of two well-studied topics: (1) sparse recovery, and (2) property testing. Consider the following basic sparse recovery problem: An algorithm is given noisy input-output examples of a linear function over a high-dimensional space. Given the promise that the linear function is sparse, breakthrough work of Candes, Romberg and Tao shows that the linear function can be recovered from a number of examples that scales logarithmically in the dimension (and this dependence is required). We look at this problem from the vantage point of property testing, by studying the complexity of merely determining whether the unknown linear function is sparse versus "far from" sparse. We show that this decision problem can be solved with a number of samples which is completely independent of the dimension if the background high-dimensional distribution of samples has i.i.d. components that are not Gaussian. Bio: Rocco Servedio is a professor and chair of the Department of Computer Science at Columbia University. His research is in theoretical computer science, and his main research interests lie in computational complexity theory, computational learning theory, property testing, and the role of randomness in computation. |
26 November 2020, 14:00 - 15:00 |
John Daugman (University of Cambridge) Title: Neural Computing, Identity, and Visual Recognition of Persons Abstract: Brains and computers have long been used as metaphors for each other. In recent years the link has grown stronger, with the explosion of results in machine learning based on big multi-layer networks, and with data from wet neurobiology showing that each cubic millimeter (mm^3) of brain tissue contains about 3 kilometers of "wiring" (neural connections). This talk discusses vision both in machines and in mammals, focusing on pattern encoding mechanisms and their efficiency. The main application areas discussed are face recognition and the IrisCode, which has recently been used to enroll the entire 1.3 billion population of India in a national ID system. Unifying theoretical ideas are entropy, and visual encoders that are optimal under the Uncertainty Principle. Bio: John Daugman received his degrees at Harvard University and then taught at Harvard before coming to Cambridge University, where he is Professor of Computer Vision and Pattern Recognition. His areas of research include computer vision, information theory, neurocomputing and statistical pattern recognition. Awards for his work in science and technology include the Information Technology Award and Medal of the British Computer Society, the "Time 100" Innovators Award, and the OBE, Order of the British Empire. He is the founder and benefactor of the Cambridge Chrysalis Trust. |
10 December 2020, 14:00 - 15:00 |
Ava Khamseh (University of Edinburgh) Title: Higher-order interactions in statistical physics, machine learning and biomedicine |
Third term, 2019-2020:
2 July 2020, 14:00 - 15:00 |
Title: Blockchain Scalability and Privacy via Scalable and Transparent zk-STARKs
Abstract: Two of the biggest challenges of blockchains like Bitcoin and Ethereum are (1) scalability - growing throughput 10x, 100x, 1000x, ... and (2) maintaining privacy. Scalable ZK proof systems can solve both problems. This talk will recount the theory-to-practice journey of ZK proofs and describe questions in algebraic coding theory whose resolution will help improve scalability and privacy even more. Bio: Eli is a co-founder and president of StarkWare, and Chairman of its Board of Directors. Eli has been passionate about the theory (“moon math”) and realization of transparent computational integrity since 2001, when he was a post-doctoral researcher at MIT and Harvard University, after completing his PhD in CS at the Hebrew University. Prior to co-founding StarkWare, Eli was a Professor of Computer Science at Technion – Israel Institute of Technology, which he joined in 2005. He is also a co-inventor of the Zerocash decentralized anonymous payment system and a founding scientist of the Electric Coin Company (aka the Zcash company). |
18 June 2020, 15:00 - 16:00 |
Title: Are All Features Created Equal? |
Second term, 2019-2020:
16 January 2020, 14:00 - 15:00, MSB2.22 |
Title: Automated Fact Checking: a natural language processing perspective Abstract: |
4 February 2020, 10:00 - 11:00 |
Hanan Samet (University of Maryland) -- ACM Distinguished Speaker Title: Reading News with Maps by Exploiting Spatial Synonyms Abstract: |
6 February 2020, 14:00 - 15:00, MSB2.22 |
Title: Multiparty Privacy and AI Security & Privacy Abstract: |
27 February 2020, 14:00 - 15:00 |
Title: Modulated Bayesian Optimisation Abstract: |
12 March 2020, 14:00 - 15:00 |
Title: Testing Discrete Distributions -- a CS Retrospective Abstract: Making inferences about the distributional properties of data is arguably one of the most fundamental classes of data analysis tasks. The study of such inferences has been a scientific endeavour for a long time, traditionally within statistics and information theory. In the last two decades, testing problems for distributions, especially on discrete domains, have also been studied in the theoretical computer science community, often from a slightly different perspective. The primary objective in this line of research is characterising the number of samples needed, in terms of the domain size or some function of the distributions involved, to perform a given inference task reliably. In this talk, I will present an overview of this very active field of research, describing some basic models and problems, landmark results, and future directions. |
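One landmark result of the kind surveyed, stated here from the literature for orientation: deciding whether an unknown distribution over a domain of size n is uniform or ε-far from uniform in total variation distance has sample complexity

\[
\Theta\!\left(\frac{\sqrt{n}}{\varepsilon^{2}}\right),
\]

which is sublinear in the domain size, in contrast with the Θ(n/ε²) samples needed to learn the distribution to accuracy ε.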
First term, 2019-2020:
10 October 2019, 14:00 - 15:00 |
Sunil Prabhakar (Purdue University) |
24 October 2019, 14:00 - 15:00 |
Evangelia Kalyvianaki (Cambridge) Title: THEMIS -- Fairness in Federated Stream Processing under Overload Abstract: |
7 November 2019, 14:00 - 15:00 |
Ata Kaban (Birmingham) Title: Uncovering structure with randomness for learning in high dimensions Abstract: |
21 November 2019, 14:00 - 15:00 |
Samson Abramsky (Oxford) Title: Non-classicality, quantum resources and quantum advantage (slides) Abstract: Our focus here is on quantum resources which take the form of non-local, or more generally contextual, correlations. Contextuality is one of the key signatures of non-classicality in quantum mechanics and has been shown to be a necessary ingredient for quantum advantage in a range of information processing tasks. We will describe a notion of simulation between quantum resources, and more generally between resources described in terms of contextual correlations, in the "sheaf-theoretic" framework for contextuality (Abramsky-Brandenburger). The talk will be essentially self-contained, introducing the various notions we will discuss. |
3 December 2019, 15:00 - 16:00 |
Jakob Nikolas Kather (University Hospital Aachen) Title: Deep Learning for Precision Oncology Abstract: |
Third term, 2018-2019:
Thu 2 May '19, 14:00 - 15:00 |
Sarah Meiklejohn (UCL) Title: Anonymity in Cryptocurrencies
(This talk will take place in CS1.04)
Abstract: A long line of recent research has demonstrated that existing cryptocurrencies often do not achieve the level of anonymity that users might expect, while at the same time another line of research has worked to increase the level of anonymity by adding new features to existing cryptocurrencies or creating entirely new cryptocurrencies. This talk will explore both of these lines of research, demonstrating both de-anonymization attacks and techniques for anonymity that achieve provably secure guarantees. |
Thu 9 May '19, 14:00 - 15:00 |
Edith Elkind (Oxford) Title: Fair Division of a Graph Abstract: We consider fair allocation of indivisible items under an additional constraint: there is an undirected graph describing the relationship between the items, and each agent's share must form a connected subgraph of this graph. This framework captures, e.g., fair allocation of land plots, where the graph describes the accessibility relation among the plots. We focus on agents that have additive utilities for the items, and consider several common fair division solution concepts, such as proportionality, envy-freeness and the maximin share guarantee. In particular, we prove that for acyclic graphs a maximin share allocation always exists and can be computed efficiently. We also discuss a continuous variant of this problem, where agents need to divide a collection of interconnected "cakes" and each agent's piece has to be connected, or, more generally, have a small number of connected components. Based on joint work with Sylvain Bouveret, Katarína Cechlárová, Ayumi Igarashi, Dominik Peters, Xiaohui Bei and Warut Suksompong. |
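For a feel of the algorithmic side, here is a hedged sketch (a standard routine in the spirit of the talk, not code from the paper) that computes a single agent's maximin share when the items lie on a path and every bundle must be contiguous: binary-search a threshold T and greedily check whether the path can be cut into n contiguous bundles each worth at least T to that agent. Utilities are assumed to be non-negative integers.

```python
def contiguous_mms(values, n_agents):
    """One agent's maximin share for items on a path (contiguous bundles).

    `values` are the agent's non-negative integer utilities for the items
    in path order. Returns the largest T such that the path can be cut
    into n_agents contiguous bundles, each worth at least T to the agent.
    """
    def feasible(threshold):
        bundles, running = 0, 0
        for v in values:
            running += v
            if running >= threshold:     # close the current bundle
                bundles += 1
                running = 0
        return bundles >= n_agents       # surplus bundles can be merged

    lo, hi = 0, sum(values)              # binary search over the answer
    while lo < hi:
        mid = (lo + hi + 1) // 2
        if feasible(mid):
            lo = mid
        else:
            hi = mid - 1
    return lo

# Hypothetical example: six plots along a road, three agents.
print(contiguous_mms([8, 3, 2, 7, 1, 9], 3))   # -> 9, e.g. the cut [8,3] | [2,7,1] | [9]
```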
Thu 6 Jun '19, 14:00 - 15:00 |
Andreas Damianou (Amazon, Cambridge) (This talk will take place in CS1.04) Title: Probability and uncertainty in Deep Learning Abstract: |
Thu 20 Jun '19, 14:00 - 15:00 |
Mario Berta (Imperial College London) Title: Quantum Technologies for Cryptography Abstract: |
Thu 11 Jul '19, 11:00 - 12:00 |
Radu Calinescu (University of York) Title: Learning, synthesis and efficient analysis of probabilistic models Abstract: |