WPCCS 2023 Schedule

The Warwick Postgraduate Colloquium in Computer Science (WPCCS) 2023 will be held on Friday 24th March 2023 in the Mathematical Sciences Building, University of Warwick. If you have any problems or questions about the scheduling of your presentation, please get in touch.

Update: We have gained access to room MSB0.08, so all of the Theory sessions have moved there, as has the Machine Learning + AI (Problem Exploration) session. Also, due to technical issues, all sessions that were going to be in MSB2.23 are now in MSB2.22.

8:30-9:00 Registration
9:00-9:30 Guest Talk: Dr Rhiannon Falconmore (MSB0.07)
Theory and Foundations (MSB0.08)
Kneser graphs are Hamiltonian (Namrata, online)
On Parallel k-Center Clustering (Sam Coy, in-person)
Dynamic Matching with Better-than-2 Approximation in Polylogarithmic Update Time (Peter Kiss, in-person)
Using Counter Machines to Find Cliques and Cycles in Graphs (Henry Sinclair-Banks, in-person)
History-Deterministic One-Counter Nets (Aditya Prakash, in-person)
Algorithmic information and firing philosophers (Problem Exploration) (Abhimanyu Pallavi Sudhir, in-person)

Natural Language Processing
BTM: Bert based Topic Modelling without Bag-of-words information (Zheng Fang, online)
Machine Learning to Tackle Fake News (John Dougrez-Lewis, online)
Incorporating Knowledge to Event Temporal Relation via Bayesian Learning (Xingwei Tan, in-person)
Using topic modeling and Computerised Grounded Theory to understand collective digital leadership (Daniela Valdes, in-person)
PHEE: A Dataset for Pharmacovigilance Event Extraction from Text (Zhaoyue Sun, in-person)
Document-Level Multi-Event Extraction with Proxy Nodes (Xinyu Wang, in-person)

Machine Learning
Machine Unlearning Unbound (Meghdad Kurmanji, cancelled)
Developing Resource Discovery Frameworks for Machine Learning Tasks in Large-scale Mobile Computing Systems (Xuanyu Liu, online)
SDS-Inf: A Serverless System for Distributed Sparse Deep Neural Network Inference (Joe Oakley, in-person)
Distinguishability Calibration to In-Context Learning (Hanqi Yan, in-person)
D3-GNN: Dynamic Distributed Dataflow for Streaming Graph Neural Networks (Rustam Guliyev, in-person)
A Bi-directional Convolutional Transformer Framework for Video Anomaly Detection (Guodong Shen, in-person)

11:00-11:15 Coffee Break
Theory and Foundations (MSB0.08)
Verifiable Differential Privacy (Ari Biswas, in-person)
Differentially Private Measures of Statistical Heterogeneity for Vectors (Mary Scott, in-person)
Fast Dynamic k-Clustering (Martin Costa, in-person)
Distribution-Free Property Testing (Hugo Aaronson, in-person)

High Performance Computing and Programming Languages
This is Driving Me Loopy: Transforming `loop` in Arrowized Functional Reactive Programs (Finnbar Keating, in-person)
Communication-avoiding Optimizations for Large-scale Unstructured-mesh Applications with OP2 (Suneth Ekanayake, in-person)
OP-PIC – An Unstructured Mesh Particle in Cell Domain Specific Language (Zaman Lantra, in-person)
Performance Portable Multiphase Flow Solutions with Discontinuous Galerkin Methods (Toby Flynn, in-person)
Code Translation to High-Level Abstraction FPGA Design of Stencil Computation (Problem Exploration) (Beniel Thileepan, in-person, 12:15-12:30)

Machine Learning + AI
An Efficient Personalized Federated Learning Algorithm (Yuchen Liu, online)
Rethink Model Compressibility in Zero-Cost Proxy Metrics (Lichuan Xiang, online)
CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning (Samuel Maddock, in-person)
The Importance of Incorporating Data Context When Developing XAI Frameworks (Saif Anwar, in-person)
Invariant Lipschitz Bandits: A Side Observation Approach (Nam Tran, in-person, 12:15-12:30)

12:30-13:30 Lunch
13:30-14:00 Guest Talk: Dr Tom Gur (MSB0.07)
Theory and Foundations (MSB0.08)
Power of 2 choices with Comparison Error (Sanidhay Bhambay, in-person)
Removing Memory from the Shared Memory Switch (Amelia Chui, in-person)
An Exercise in Tournament Design: When Some Matches Must Be Scheduled (Peter Strulo, in-person)

Security
Order oriented transaction fee mechanism (Mohammad Nourbakhsh, in-person)
Security Analysis of mobile Point-of-Sale (mPoS) Terminals (Mahshid Mehr Nezhad, in-person)
Developments in Self-Enforcing E-Voting (Luke Harrison, online)
A Heterogeneous Redundant Architecture for Industrial Control System Security (Zhihao Dai, in-person, 14:45-15:00)

ML in Biology
Supervised Machine Learning Model for Diabetic Students’ Glucose Levels Classification System (Mona Alotaibi, online)
Different cortical connectivities in human females and males relate to differences in physical workload and body composition, reward and emotional systems, and memory (Ruohan Zhang, cancelled)
Weakly Supervised Learning for Predicting Viral Oncoprotein in Nasopharyngeal Carcinoma (Made Satria Wibawa, in-person)
Spatial-temporal graph convolutional neural networks on evolving cell surfaces (Edward Offord, in-person, 14:45-15:00)

15:00-15:15 Coffee Break
Machine Learning + AI (Problem Exploration) (MSB0.08)
Network dissection in Diffusion Model (Yu Cao, online)
Fault Tolerant Task Offloading in Edge-Based IoT Networks: A Multi-armed Bandit Approach (Alya Alhazmi, in-person)
Research in Federated Graph Neural Networks (Yan Qian, in-person)
Monocular Depth Estimation (MDE) (Bashayer Abdallah, in-person)
Exploration in NAS (Neural Architecture Search) techniques (Minghao Xu, in-person)
What would Galileo think of Neural Architecture Search? (Rosco Hunter, in-person)

Machine Learning (Problem Exploration)
Multi-Camera Multi-Object Trajectory Forecasting (MCTF) (Amey Noolkar, in-person)
Explaining predictions of Neural Network with hidden paradigms (Bhushan Atote, in-person)
Machine Learning for Histopathology Images (George Wright, in-person)
Source Code Plagiarism with Supervised Machine Learning: Open Issues and Challenges (Fahad Ebrahim, in-person)
Towards Efficient Large-Scale Federated Learning (Sara Alosaime, online)
Optimal Transport for Causal Abstractions (Giorgos Felekis, cancelled)

16:15-16:30 Closing including Prize Announcement


Dr Rhiannon Falconmore (Vector Informatik) - The evolution of a PhD student: Academia to Industry

From project inception to the viva and beyond, the path of each PhD student is completely unique, rewarding, daunting and especially non-linear. This talk covers the timeline of my academic evolution from 2017 (beginning of PhD) until now, including all the research I undertook, the ups (paper acceptances, conference experiences), the downs (burnout, COVID), why I chose to move into industry and what I do now (still a lot of research!).

Dr Tom Gur (Warwick) - How to write a research statement

The importance of writing a strong research statement cannot be overstated. Yet despite the career-changing implications of these documents, there is often not enough discussion and guidance on how to approach writing them. In this talk, I will discuss how to write a strong and effective research statement and how to update and use it throughout one's academic career.

Theory and Foundations

Kneser graphs are Hamiltonian

by Namrata

For integers k >= 1 and n >= 2k+1, the Kneser graph K(n,k) has as vertices all k-element subsets of an n-element ground set, and an edge between any two disjoint sets. It has been conjectured since the 1970s that all Kneser graphs admit a Hamilton cycle, except the Petersen graph K(5,2). The main contribution of this paper is to prove the conjecture in full generality. We also extend this Hamiltonicity result to all connected generalized Johnson graphs J(n,k,s), which have as vertices all k-element subsets of an n-element ground set, and an edge between two sets whose intersection has size exactly s. Our results imply that all families of vertex-transitive graphs defined by intersecting set systems are Hamiltonian, which settles an interesting special case of Lovász's conjecture on Hamiltonicity in vertex-transitive graphs from 1970.
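
As a quick illustration of the objects in this abstract (not part of the proof), the Kneser graph can be generated directly from its definition; a minimal Python sketch, where `kneser_graph` is a hypothetical helper name:

```python
from itertools import combinations

def kneser_graph(n, k):
    """Vertices: all k-subsets of {0..n-1}; edges join disjoint subsets."""
    vertices = [frozenset(c) for c in combinations(range(n), k)]
    edges = [(u, v) for u, v in combinations(vertices, 2) if u.isdisjoint(v)]
    return vertices, edges

# K(5,2) is the Petersen graph, the one exception to the Hamiltonicity result.
V, E = kneser_graph(5, 2)
print(len(V), len(E))  # 10 15
```

Each vertex of K(5,2) has degree C(3,2) = 3, giving 10*3/2 = 15 edges, as the sketch confirms.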

On Parallel k-Center Clustering

by Sam Coy

We consider the classic k-center problem in a parallel setting, on the low-local-space Massively Parallel Computation (MPC) model. As a central clustering problem, the k-center problem has been studied extensively. Still, until very recently, all parallel MPC algorithms required at least Ω(k) local space per machine. While this setting covers the case of small values of k, for a large number of clusters these algorithms require large local memory, making them poorly scalable. The case of large k was considered recently for the low-local-space MPC model by Bateni et al. (2021), who gave an O(log log n)-round MPC algorithm that produces k(1+o(1)) centers whose cost is a multiplicative O(log log log n)-approximation. In this work, we extend the algorithm of Bateni et al. and design a low-local-space MPC algorithm that in O(log log n) rounds returns a clustering with k(1+o(1)) clusters that is an O(log* n)-approximation for k-center.
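
For context, the classical sequential baseline for k-center is Gonzalez's farthest-first traversal, a 2-approximation; the MPC algorithms above aim for comparable guarantees with low local space. A minimal sketch of the sequential baseline (illustrative only, not the paper's algorithm):

```python
def greedy_k_center(points, k):
    """Gonzalez's farthest-first traversal: repeatedly add the point
    farthest from the current centers. A sequential 2-approximation."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    centers = [points[0]]
    while len(centers) < k:
        farthest = max(points, key=lambda p: min(dist(p, c) for c in centers))
        centers.append(farthest)
    return centers

pts = [(0, 0), (0, 1), (10, 0), (10, 1), (20, 0)]
print(greedy_k_center(pts, 3))  # [(0, 0), (20, 0), (10, 1)]
```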

Dynamic Matching with Better-than-2 Approximation in Polylogarithmic Update Time

by Peter Kiss

We present dynamic algorithms with polylogarithmic update time for estimating the size of the maximum matching of a graph undergoing edge insertions and deletions with approximation ratio strictly better than 2. Specifically, we obtain a (1 + 1/√2 + ε) ≈ 1.707 + ε approximation in bipartite graphs and a 1.973 + ε approximation in general graphs. We thus answer in the affirmative the major open question first posed in the influential work of Onak and Rubinfeld (STOC'10) and repeatedly asked in the dynamic graph algorithms literature. Our randomized algorithms also work against an adaptive adversary and guarantee worst-case polylog update time, both w.h.p. Our algorithms are based on simulating new two-pass streaming matching algorithms in the dynamic setting. Our key new idea is to invoke the recent sublinear-time matching algorithm of Behnezhad (FOCS'21) in a white-box manner to efficiently simulate the second pass of our streaming algorithms, while bypassing the well-known vertex-update barrier.
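
For reference, the factor-2 barrier these results break is the guarantee of any greedy maximal matching; a minimal sketch of that classic baseline:

```python
def greedy_matching(edges):
    """Greedy maximal matching: take each edge whose endpoints are still
    free. Any maximal matching is a 2-approximation of the maximum one,
    which is exactly the ratio the dynamic algorithms above improve on."""
    matched = set()
    matching = []
    for u, v in edges:
        if u not in matched and v not in matched:
            matching.append((u, v))
            matched.update((u, v))
    return matching

print(greedy_matching([(1, 2), (2, 3), (3, 4)]))  # [(1, 2), (3, 4)]
```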

Using Counter Machines to Find Cliques and Cycles in Graphs

by Henry Sinclair-Banks

Consider a directed graph equipped with counters that can be incremented and decremented over an edge. Given two nodes p and q, the reachability problem asks whether there is a path from p to q in which all counters start at zero, all counters remain non-negative throughout, and all counters end at zero. I will show that one can construct instances of the reachability problem that equate to an instance of finding a clique or to an instance of finding a cycle in a graph.
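
A toy brute-force checker for this reachability problem can make the setting concrete. Note the counters are artificially capped here to keep the search finite; the real problem places no such cap, and `zero_reach` is a hypothetical name for illustration:

```python
from collections import deque

def zero_reach(edges, p, q, dim, cap=5):
    """Search for a p->q path on which all counters start at zero, stay
    non-negative throughout, and end at zero. `edges` maps a node to a
    list of (target, counter-delta) pairs; `dim` is the number of
    counters. Counter values are capped at `cap` (sketch only)."""
    start = (p, (0,) * dim)
    seen = {start}
    frontier = deque([start])
    while frontier:
        node, ctr = frontier.popleft()
        if node == q and all(c == 0 for c in ctr):
            return True
        for target, delta in edges.get(node, []):
            nxt = tuple(c + d for c, d in zip(ctr, delta))
            if all(0 <= c <= cap for c in nxt):
                state = (target, nxt)
                if state not in seen:
                    seen.add(state)
                    frontier.append(state)
    return False

edges = {"a": [("b", (1,))], "b": [("c", (-1,))]}
print(zero_reach(edges, "a", "c", dim=1))  # True: +1 then -1 ends at zero
```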

History-Deterministic One-Counter Nets

by Aditya Prakash

On models of computation that have a finite global control (such as finite state automata or pushdown automata), there is a sharp contrast between nondeterministic and deterministic models. While deterministic models are more efficient for verification and synthesis, they are often less succinct or less expressive than their nondeterministic counterpart. As such, many intermediaries between deterministic and nondeterministic models have been studied, with the aim of getting the best of both worlds within certain contexts. A notion that has been quickly gaining interest over the recent years in this regard is that of history-determinism. A non-deterministic automaton is said to be history-deterministic if it is possible to resolve the non-deterministic choices in the automaton 'on-the-fly', based on the input that's read so far. In this talk, I will introduce history-determinism and explore it in the setting of one-counter systems. This is joint work with K.S. Thejaswini.

Algorithmic information and firing philosophers (Problem Exploration)

by Abhimanyu Pallavi Sudhir

I explain why algorithmic information is at the root of all our problems: bounded rationality, game and market dynamics, market failures and macroeconomic problems, the efficient market hypothesis, decision theory, extrapolated volition, thermodynamics. A full understanding of algorithmic information means having some notion of what it means for an algorithm to "rationally" update on algorithmic information (i.e. which algorithms it should trust) and how idealized probabilistic beliefs and Bayes-rational expected utility maximization emerge in some limit.

Verifiable Differential Privacy

by Ari Biswas

Differential Privacy (DP) is often presented as a strong privacy-enhancing technology with broad applicability and advocated as a de-facto standard for releasing aggregate statistics on sensitive data. However, in many embodiments, DP introduces a new attack surface: a malicious entity entrusted with releasing statistics could manipulate the results and use the randomness of DP as a convenient smokescreen to mask its nefariousness. Since revealing the random noise would obviate the purpose of introducing it, the miscreant may have a perfect alibi. To close this loophole, we introduce the idea of Verifiable Differential Privacy, which requires the publishing entity to output a zero-knowledge proof that convinces an efficient verifier that the output is both DP and reliable. We also demonstrate that computational assumptions are necessary by showing a separation between information-theoretic DP and computational DP under our definition of verifiability.
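
To make the attack surface concrete, here is a sketch of a standard (unverified) Laplace mechanism for a sensitivity-1 counting query; nothing in its output lets a verifier check that the noise was honestly sampled, which is the loophole described above. `laplace_release` is a hypothetical name for illustration:

```python
import math
import random

def laplace_release(true_count, epsilon, seed=None):
    """Release true_count + Lap(1/epsilon) noise via inverse-CDF
    sampling. The sampled noise stays private to the publisher, so a
    dishonest publisher could substitute an arbitrary 'noise' value."""
    rng = random.Random(seed)
    u = rng.random() - 0.5  # uniform on [-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

print(laplace_release(100, epsilon=1.0, seed=42))
```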

Differentially Private Measures of Statistical Heterogeneity for Vectors

by Mary Scott

This work serves as an exploration of existing and emerging research in Federated Machine Learning systems, in particular those which incorporate differential privacy and/or characterise statistical heterogeneity. Statistical heterogeneity is a term that can be used to describe a dataset if the observed trends from each sample differ from that of the overall population more than would be expected due to random error alone. The possible adaptations of existing measures of skewness have been examined to select a promising candidate that can intelligently select a sample from a statistically heterogeneous dataset. The accuracy of the trends inferred from that sample is then optimised and compared to those of the dataset. The eventual outcome is to carry out some federated learning or analytics on realistic, statistically heterogeneous data in a differentially private way, and compare its effectiveness with previous methods.

Fast Dynamic k-Clustering

by Martin Costa

We present an O(1)-approximate fully dynamic algorithm for the k-median and k-means problems on metric spaces with amortized update time Õ(k) and worst-case query time Õ(k²), achieving better update time than the current state-of-the-art by Henzinger and Kale [ESA'20]. Our algorithm is based on a dynamization of the work of Mettu and Plaxton [UAI'02]. We also provide a lower bound for dynamic k-median which shows that any O(1)-approximate algorithm with Õ(poly(k)) query time must have Ω̃(k) amortized update time, even in the incremental setting, showing that our update time is optimal up to polylogarithmic factors.

Distribution-Free Property Testing

by Hugo Aaronson

Property testers are a class of sublinear algorithms: algorithms so efficient that they do not read the entire input, yet still gain information about it. Such a tester aims to distinguish inputs that have a property from inputs that are far from the property. This will be a discussion of a natural extension of this field to the distribution-free setting. Distribution-free learning is a fundamental cornerstone of the probably approximately correct (PAC) learning model. In this talk I will discuss this topic and demonstrate a lower bound for testing a specific class of properties.
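
A textbook example of the flavour of algorithm involved (illustrative only, not from this talk) is a spot-checker for sortedness: sample a few random index pairs and reject on any inversion. Sorted inputs always pass, while inputs far from sorted are rejected with high probability:

```python
import random

def looks_sorted(arr, trials=100, seed=0):
    """Toy property-testing sketch: sample random index pairs and reject
    on any inversion, reading only O(trials) entries of the input."""
    rng = random.Random(seed)
    for _ in range(trials):
        i, j = sorted(rng.sample(range(len(arr)), 2))
        if arr[i] > arr[j]:
            return False
    return True

print(looks_sorted(list(range(50))))         # True: sorted inputs always pass
print(looks_sorted(list(range(50, 0, -1))))  # False: every sampled pair is an inversion
```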

Power of 2 choices with Comparison Error

by Sanidhay Bhambay

We consider a system of n identical servers where each server processes jobs at rate μ = 1. In the standard Power-of-2 (Po2) scheme, 2 servers are sampled uniformly at random and an incoming job is assigned to the least loaded of them. However, this requires that the load comparison result is always accurate, which may not be true in practice. In this work, we analyze the Po2 scheme under a model with noisy comparison results. Specifically, we consider the model where an arrival joins the least loaded sampled server with probability 1 − ε, for ε ∈ (0,1). For this model we show that the system is stable for arrival rates λ < min(1, 1/(2ε)). Moreover, we characterise the performance gain over the random scheme: it is exponential in the heavy-traffic limit λ → 1 and is given by 1/log(2 − 2ε).
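
The noisy-comparison model can be illustrated with a deliberately simplified discrete-time simulation (a sketch under assumed parameters, not the paper's continuous-time model): each step brings one potential arrival, routed by noisy Po2, and one service opportunity at a uniformly chosen server.

```python
import random

def simulate_po2(n=100, steps=20000, lam=0.7, eps=0.1, seed=1):
    """Toy slotted-time supermarket model with noisy comparisons: with
    probability 1-eps the arrival joins the shorter of two sampled
    queues, otherwise the longer one. Returns the mean queue length."""
    rng = random.Random(seed)
    q = [0] * n
    for _ in range(steps):
        if rng.random() < lam:  # arrival, routed by noisy Po2
            i, j = rng.sample(range(n), 2)
            lo, hi = (i, j) if q[i] <= q[j] else (j, i)
            q[lo if rng.random() < 1 - eps else hi] += 1
        s = rng.randrange(n)    # one service opportunity per step
        if q[s] > 0:
            q[s] -= 1
    return sum(q) / n

print(simulate_po2())
```

Varying `eps` toward 1/2 in this sketch shows the routing degrading toward the purely random scheme.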

Removing Memory from the Shared Memory Switch

by Amelia Chui

Network switch problems model the arrival and transmission of data packets in an online setting: irrevocable decisions to discard packets must be made when too many have accumulated and capacity restrictions are reached. The shared memory switch is one such network switch problem, for which we know of only one effective algorithm, with gradually improving competitiveness bounds, and it has long been suspected that this algorithm is indeed the best. We explore this hypothesis in the restricted setting of memoryless algorithms, in which decisions must be made based only on the current internal state of the network switch, without any knowledge of past decisions. We also consider the power of non-determinism in this context and whether it is sufficient to gain a competitive edge.
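
The policy usually studied for the shared memory switch, and presumably the algorithm the abstract alludes to, is Longest Queue Drop (LQD); note that it is memoryless in exactly the sense described. A minimal sketch (`lqd_enqueue` is a hypothetical name):

```python
def lqd_enqueue(queues, port, capacity):
    """Longest-Queue-Drop sketch: always admit the arriving packet into
    its output-port queue; if the shared buffer then overflows, evict
    one packet from the longest queue (possibly the one just joined).
    The decision uses only the current queue lengths, i.e. no memory."""
    queues[port] = queues.get(port, 0) + 1
    if sum(queues.values()) > capacity:
        longest = max(queues, key=lambda p: queues[p])
        queues[longest] -= 1
    return queues

buf = {}
for port in [0, 0, 0, 1]:  # three packets for port 0, then one for port 1
    lqd_enqueue(buf, port, capacity=3)
print(buf)  # {0: 2, 1: 1}: the overflow was absorbed by the longest queue
```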

An Exercise in Tournament Design: When Some Matches Must Be Scheduled

by Peter Strulo

In single-elimination tournaments, players play one-on-one matches with the winner proceeding to the next round until only one player remains. The problem of manipulating the outcome of the tournament by carefully choosing which opponents play each other in each round (the seeding) has been studied extensively. We introduce a new variant of this problem where the aim is to choose a seeding which results in certain desired matches being played, rather than a specific player winning. We obtain both hardness and tractability results: the problem is NP-hard in general but polynomial-time solvable when the input digraph modelling the pairwise results is acyclic.
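
To make the setting concrete, here is a tiny single-elimination simulator (illustrative only): `beats` encodes the pairwise-results digraph, and the chosen seeding determines which matches actually get played.

```python
def play_bracket(seeding, beats):
    """Play out a single-elimination bracket. `seeding` lists players in
    bracket order (length a power of two); `beats` is a set of
    (winner, loser) pairs encoding the pairwise-results digraph.
    Returns the list of rounds; the matches played in a round are the
    adjacent pairs of the previous round."""
    rounds = [list(seeding)]
    while len(rounds[-1]) > 1:
        prev = rounds[-1]
        rounds.append([a if (a, b) in beats else b
                       for a, b in zip(prev[::2], prev[1::2])])
    return rounds

# This seeding forces the desired match (a, c) to be played in round 2.
print(play_bracket(["a", "b", "c", "d"], {("a", "b"), ("c", "d"), ("a", "c")}))
# [['a', 'b', 'c', 'd'], ['a', 'c'], ['a']]
```

The seeding-manipulation problem in the abstract asks for a seeding under which a target set of such matches is guaranteed to occur.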

Natural Language Processing

BTM: Bert based Topic Modelling without Bag-of-words information

by Zheng Fang

With the development of neural topic models in recent years, topic modelling is playing an increasingly important role in natural language processing. However, existing topic models all rely on bag-of-words (BOW) information, either as training input or training target. This prevents topic models from capturing word order information in sentences and causes them to suffer from the out-of-vocabulary (OOV) problem, i.e. they cannot handle unobserved words in new documents. To address these issues, we developed a novel topic model based on the pre-trained language model BERT. The model can infer the topic distribution of a sentence without using any BOW information. Furthermore, the model can also infer the topic distribution for each word in a sentence. Unlike traditional neural topic models that use decoder matrix weights to extract the top words for each topic, we can extract the top words for each topic directly from the word topic distribution.
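
The two limitations mentioned above, lost word order and out-of-vocabulary (OOV) words, are easy to see in a minimal bag-of-words sketch:

```python
from collections import Counter

def bow_vector(tokens, vocab):
    """Bag-of-words representation: token counts over a fixed vocabulary.
    Word order is discarded, and OOV tokens vanish entirely."""
    counts = Counter(t for t in tokens if t in vocab)
    return [counts[w] for w in vocab]

vocab = ["dog", "bites", "man"]
a = bow_vector("dog bites man".split(), vocab)
b = bow_vector("man bites dog".split(), vocab)
print(a == b)  # True: word order is invisible to BOW
print(bow_vector("cat purrs".split(), vocab))  # [0, 0, 0]: all tokens are OOV
```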

Machine Learning to Tackle Fake News

by John Dougrez-Lewis

Fake news and online misinformation are highly problematic. We have developed several methods to automatically debunk Twitter rumours in a real-time setting. One approach is concerned with how the rumours are written, separating "what is said" from "how it is said". Another approach automatically retrieves evidence from the internet and determines the rumour's veracity by constructing a knowledge graph, drawing inference from the connections between multiple sources. Our approaches achieve state-of-the-art results for the PHEME dataset.

Incorporating Knowledge to Event Temporal Relation via Bayesian Learning

by Xingwei Tan

Understanding event temporal relations in text is a challenging task that lacks human-annotated data. Existing models for extracting temporal relations between events lack a principled method of incorporating external knowledge. We introduce Bayesian-Trans, a Bayesian learning-based method that models the temporal relation representations as latent variables and infers their values via Bayesian inference and translational functions. Compared to conventional neural approaches, instead of performing point estimation to find the best set of parameters, the proposed model infers the parameters' posterior distribution directly, enhancing the model's capability to encode and express uncertainty about the predictions. Experimental results on three widely used datasets show that Bayesian-Trans outperforms existing approaches for event temporal relation extraction. We additionally present detailed analyses of uncertainty quantification, comparison of priors, and ablation studies, illustrating the benefits of the proposed approach.

Using topic modeling and Computerised Grounded Theory to understand collective digital leadership

by Daniela Valdes

Background: The study of digital transformation in hospital boards in England is an under-studied area, with only one study evaluating the Global Digital Exemplars programme and one other study manually reviewing the content of NHS Trust Boards.
Phase 1 objectives: Support and expand (currently limited) methodological advances in the use of text mining methods and computerised grounded theory (CGT) in collective leadership research for a single, digitally mature NHS Hospital.
Methods: Mixed methods: computerised text mining, document analysis and asynchronous meeting recording observation (when available). Topic modeling is used to inform CGT to identify themes.
Results: Text data from board packs and minutes has different semantic characteristics, with relatively limited reference to digital words. LDA modeling identified 8 topics in the board pack and I was able to identify ’digitally related’ themes.
Discussion: We look to discuss proposals for further refinement of the methodology.

PHEE: A Dataset for Pharmacovigilance Event Extraction from Text

by Zhaoyue Sun

The primary goal of drug safety researchers and regulators is to promptly identify adverse drug reactions. Evaluating and monitoring drug safety (i.e., pharmacovigilance) involves analyzing an ever-growing collection of medical reports. In this scenario, facilitating analysis of such reports via automation has the potential to rapidly identify safety signals. Unfortunately, public resources for developing natural language models for this task are scant. We present PHEE, a novel dataset for pharmacovigilance comprising over 5000 annotated events from medical case reports and biomedical literature, making it the largest such public dataset to date. We describe the hierarchical event schema designed to provide coarse and fine-grained information about patients' demographics, treatments and (side) effects. Along with the discussion of the dataset, we present a thorough experimental evaluation of current state-of-the-art approaches for biomedical event extraction, point out their limitations, and highlight open challenges to foster future research in this area.

Document-Level Multi-Event Extraction with Proxy Nodes

by Xinyu Wang

Document-level multi-event extraction aims to extract structural information from a given document automatically. Most recent approaches involve two steps: (1) modeling entity interactions; (2) decoding entity interactions into events. However, such approaches ignore a global view of the inter-dependency of multiple events. Moreover, an event is decoded by iteratively merging its related entities as arguments, which might suffer from error propagation and is computationally inefficient. We propose an alternative approach for document-level multi-event extraction with proxy nodes. The proxy nodes, representing pseudo-events, are able to build connections with one another, essentially capturing global information. This makes it possible to compare the similarity between the set of predicted events and the set of ground-truth events. By directly minimizing the distance between the proxy nodes and labels, the model is trained towards the global optimum directly, which improves performance and reduces training time.

Machine Learning

Machine Unlearning Unbound

by Meghdad Kurmanji

The problem of Deep Machine Unlearning refers to the process of eliminating the impact of a data cohort on the weights of a pre-trained deep neural network. This issue has garnered significant attention in recent times, due to the growing utilization of such models in applications that handle personal user data. The need to allow individuals to assert their "right to be forgotten" requires the development of efficient unlearning algorithms. Beyond privacy concerns, unlearning is also of practical importance for removing biases, outdated samples, outliers, and noisy labels in various applications. This paper presents a novel unlearning approach, referred to as SCRUB, and compares its performance with state-of-the-art (SOTA) methods through a comprehensive experimental evaluation. The results demonstrate that SCRUB consistently surpasses other algorithms in unlearning performance, as measured by three different metrics that reflect different use cases, without sacrificing the overall performance of the model.

Developing Resource Discovery Frameworks for Machine Learning Tasks in Large-scale Mobile Computing Systems

by Xuanyu Liu

This research focuses on developing machine learning and deep learning techniques for distributed mobile devices. In particular, the following research tasks are planned: 1) investigating a new resource discovery framework for distributed devices that lets tasks automatically reach the equipment able to run them; 2) integrating the resource discovery framework into machine/deep learning models to enable their efficient training and inference on resource-limited mobile devices; 3) implementing the resource discovery framework and machine learning models in a network of large-scale distributed mobile devices.

SDS-Inf: A Serverless System for Distributed Sparse Deep Neural Network Inference

by Joe Oakley

Serverless computing offers elasticity and cost-effectiveness, but challenges such as limited CPU/memory have limited its adoption for data-intensive applications such as ML inference. Without established inter-process communication (IPC) solutions such as MPI in the serverless domain, parallel computation with significant IPC requirements is challenging. We present SDS-Inf, the first fully serverless and highly scalable system for distributed ML inference. We outline two serverless communication schemes, leveraging cloud-based publish-subscribe/queueing and object storage offerings, and apply these to ML batch inference workloads in conjunction with FaaS compute. We also employ hypergraph partitioning to reduce communication overheads and achieve both (intra-layer) model and data parallelism. Through experiments on benchmark sparse DNNs, we show that leveraging publish-subscribe/queueing services for FaaS IPC yields competitive performance at significantly reduced cost, when compared to more common object storage solutions. Experiments confirm that our serverless ML inference solution is scalable and cost-effective, and can handle large distributed workloads.

Distinguishability Calibration to In-Context Learning

by Hanqi Yan

Recent years have witnessed increasing interest in prompt-based learning, in which models can be trained on only a few annotated instances, making them suitable in low-resource settings. However, PLMs built on the transformer architecture tend to generate similar output embeddings, making it difficult to discriminate between different class labels. In this work, we alleviate this information diffusion issue, i.e., different tokens share a large proportion of similar information after going through stacked multiple self-attention layers in a transformer, by proposing a calibration method built on feature transformations through rotation and scaling to map a PLM-encoded embedding into a new metric space, guaranteeing the distinguishability of the resulting embeddings. Furthermore, we take advantage of hyperbolic embeddings to capture the hierarchical relations among fine-grained class-associated token embeddings via a coarse-to-fine metric learning strategy, enhancing the distinguishability of the learned output embeddings. Extensive experiments on three datasets under various settings demonstrate the effectiveness of our approach.

D3-GNN: Dynamic Distributed Dataflow for Streaming Graph Neural Networks

by Rustam Guliyev

Graph Neural Network (GNN) models on streaming graphs entail algorithmic challenges to continuously capture the dynamic state of the graph and incrementally update the model, as well as systems challenges to optimize latency, memory, and throughput during both inference and training. We present D3-GNN as the first distributed, hybrid-parallel, streaming GNN system that performs online predictions, incremental embedding computations, and training over real-time event streams. D3-GNN maintains the GNN model up-to-date with every change in the graph topology and features, and employs caching to efficiently capture cascading updates from the neighborhood aggregation step. We introduce streaming GNN aggregators which are feature-graph synopsis operators for incremental updates in a distributed fashion. A novel windowed forward pass solution is introduced to counteract the potential explosion of computational load of GNNs. D3-GNN is built on top of Apache Flink and is designed to benefit from the functionalities of the underlying fault-tolerant streaming middleware.

A Bi-directional Convolutional Transformer Framework for Video Anomaly Detection

by Guodong Shen

While weakly-supervised and self-supervised approaches for video anomaly detection depend on the representativeness of training anomalous samples, we retrospectively investigate one-class classification and formulate a predictive framework that can generate accurate predictions for normal frames and flawed predictions for abnormal frames. Given the transformer’s strength in language translation, we develop a unique vision transformer, which inherits the original paradigm and behaviors, as our framework’s backbone with important modifications. Specifically, we convolutionalize the self-attention mechanism to handle high-dimensional frames and eliminate the positional encoding module after deeming it non-essential. Additionally, we introduce to its decoder a recurrent information bridge where low-level features can flow smoothly over time and space, hence strengthening predictions with fine details. Finally, to stabilize anomaly scores, we duplicate the decoding pipeline to extrapolate frames in reverse order and leverage the predicted middle frames to infer anomalies. Several experiments on public benchmarks confirm the efficacy of our framework.

High Performance Computing and Programming Languages

This is Driving Me Loopy: Transforming `loop` in Arrowized Functional Reactive Programs

by Finnbar Keating

Arrowized Functional Reactive Programming (AFRP) is a useful paradigm for writing reactive programs: programs which take in inputs throughout their execution and return outputs at the same rate to react to their environment. Its usefulness comes from being embedded in Haskell, allowing programs to utilise Haskell's expressive type system to check for a variety of runtime errors - which is key in domains like robotics where errors can be costly. Unfortunately, performance issues of AFRP mean that it does not see much use. One such issue is that its `loop` construct is defined in a way that requires the execution order of a program to be worked out at runtime. We avoid this particular overhead with a novel program transformation which turns `loop` constructs into a restricted form with known execution order. We discuss the performance benefits and prove the correctness of this transformation.

Communication-avoiding Optimizations for Large-scale Unstructured-mesh Applications with OP2 (top)

by Suneth Ekanayake

We investigate data movement-reducing and communication-avoiding (CA) optimizations and their practicable implementation for large-scale unstructured-mesh applications. Utilizing the high-level abstraction of the OP2 DSL for unstructured-mesh based codes, we reason about techniques for reduced communications across a consecutive sequence of loops - a loop-chain for distributed-memory parallelization. A new CA back-end for OP2 is designed with an extension to operate on a cluster of GPUs, such that these techniques can automatically be applied to any OP2 application. The new CA back-end is applied to the loop chains of two non-trivial applications, including the OP2 version of Rolls-Royce’s production CFD application, Hydra. Performance is investigated on both CPU and GPU clusters using 8M and 24M node mesh sizes. The results demonstrate 30–50% runtime reductions for select configurations. We model and examine the determinants and characteristics of a given unstructured-mesh loop-chain providing insights into the general feasibility and profitability of CA optimizations.

OP-PIC – An Unstructured Mesh Particle in Cell Domain Specific Language (top)

by Zaman Lantra

Particle-in-cell (PIC) calculation is a well-established procedure for modelling the behaviour of charged particles in the presence of electric/magnetic fields. Performance portability of PIC applications on emerging massively parallel heterogeneous HPC hardware has become an increasing concern for scientific organizations relying on these applications. One solution is the development of high-level abstractions such as a domain-specific language (DSL) that allows domain specialists to specify the problem while leaving the implementation to a lower level through automatic code generation techniques, translating the specification to hardware-specific parallelisations such as OpenMP, MPI, CUDA, etc. DSLs for PIC codes, especially unstructured-mesh PIC, have not been well developed in the HPC community. In this work, we present a new DSL for unstructured PIC calculations, using the source-to-source translation and code generation techniques developed in the OP2 DSL (a DSL for unstructured-mesh algorithms), together with some preliminary results of single-node sequential, OpenMP and CUDA parallel versions.

Performance Portable Multiphase Flow Solutions with Discontinuous Galerkin Methods (top)

by Toby Flynn

In this work we present a performance portable solver toolkit based on the OP2 DSL, targeted at developing multiphase flow simulations. The new toolkit brings together the state-of-the-art in multiphase flow solutions based on Discontinuous Galerkin (DG) methods for achieving high-order accuracy within OP2's established performance portable framework. A key aspect of the modelling is how the fluid interface can be managed through high-order methods while maintaining the necessary stability, accuracy and performance. The new OP2-DG methods are used for developing a number of applications, including a non-trivial 2-fluid liquid whistle problem. The OP2-generated code for these problems is benchmarked, validated and compared to experimental results on a wide range of multi-core and many-core systems, demonstrating near-optimal performance and scaling.

Code Translation to High-Level Abstraction FPGA Design of Stencil Computation (Problem Exploration) (top)

by Beniel Thileepan

Structured-mesh based stencil computations are a common motif appearing in many numerical algorithms. Recent work on FPGAs has shown promising performance, in both runtime and energy efficiency, for this class of applications. FPGA vendors have introduced high-level synthesis (HLS) tools based on languages such as C/C++, OpenCL, and SYCL to ease the development of FPGA applications, which would otherwise require low-level RTL languages. However, a significant amount of custom optimization is still needed to obtain the best performance. Recent works have demonstrated how Domain Specific Languages (DSLs) can be used to generate optimized implementations for various accelerator architectures. In this presentation, I will detail research into using DSLs to generate optimized FPGA code, with an overview of the techniques to generate code using the OPS DSL, and present some initial performance results on Xilinx FPGAs compared to hand-coded performance and to traditional architectures such as CPUs and GPUs.

Machine Learning + AI

An Efficient Personalized Federated Learning Algorithm (top)

by Yuchen Liu

Recently, a growing number of studies have pointed out that when the data of various participants are distributed in a Non-IID (non-independent and identically distributed) manner, the performance of the Federated Learning (FL) global model can be even worse than that of local models trained only on clients' local data. To address this, we propose an intuitive personalized FL algorithm based on partial model aggregation. In this method, the model is divided into two parts. One part participates in global synchronization across the clients and shares the common knowledge abstracted from them. The other part is kept locally on the clients and learns personalized features from their local data. We systematically study the influence of different partition positions, i.e., the position at which the model is partitioned, and then propose a method to determine the partition position.
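
The partial-aggregation idea above can be sketched as follows. This is an illustrative sketch, not the authors' implementation; the layer names (`base`, `head`) and the choice of which part is shared are hypothetical:

```python
import numpy as np

def partial_aggregate(client_models, shared_keys):
    """Average only the shared part of each client's model across clients;
    the personal part stays local. client_models: list of dicts name -> array."""
    aggregated = {}
    for key in shared_keys:
        aggregated[key] = np.mean([m[key] for m in client_models], axis=0)
    # Each client keeps its personal layers and adopts the shared average.
    return [
        {k: (aggregated[k] if k in shared_keys else m[k]) for k in m}
        for m in client_models
    ]

# Two clients sharing the "base" layer but keeping "head" personalized.
clients = [
    {"base": np.array([1.0, 3.0]), "head": np.array([0.0])},
    {"base": np.array([3.0, 1.0]), "head": np.array([9.0])},
]
updated = partial_aggregate(clients, shared_keys={"base"})
# Both clients now hold base = [2.0, 2.0]; their heads are unchanged.
```

Moving the partition point (which keys are in `shared_keys`) is exactly the design variable the talk studies.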

Rethink Model Compressibility in Zero-Cost Proxy Metrics (top)

by Lichuan Xiang

Inspired by zero-cost metrics for pruning models at initialization, recent ‘zero-cost proxy’ methods have achieved promising neural architecture search performance with negligible computational cost. Regardless of the particular pruning metric, the core idea is to sum the saliency scores of all parameters across the entire network. However, we observed that saliency metrics drop significantly before accuracy starts dropping during the pruning process, leading to models with similar accuracy having widely different metric scores. We therefore want to leverage the compressibility of the model at initialization to further improve the correlation.
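
As an illustration of the "sum over all parameters' saliency" idea, here is a SNIP-style score (|w · ∂L/∂w| summed over the network) on a toy linear model; the specific metrics and models in the talk may differ:

```python
import numpy as np

def snip_saliency(weights, grads):
    """SNIP-style zero-cost proxy: sum |w * dL/dw| over every parameter."""
    return sum(np.abs(w * g).sum() for w, g in zip(weights, grads))

# Toy one-layer model y_hat = x @ W with squared-error loss.
rng = np.random.default_rng(0)
x = rng.normal(size=(8, 4))
W = rng.normal(size=(4, 2))
y = rng.normal(size=(8, 2))
err = x @ W - y
grad_W = x.T @ err / len(x)           # dL/dW for L = (1/2N) * ||xW - y||^2
score = snip_saliency([W], [grad_W])  # one scalar used to rank the network
```

In a zero-cost NAS setting, `score` would be computed for each candidate architecture at initialization and used as a proxy for its trained accuracy.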

CANIFE: Crafting Canaries for Empirical Privacy Measurement in Federated Learning (top)

by Samuel Maddock

Federated Learning (FL) trains machine learning models in distributed environments where clients do not share raw data but instead send model updates to a server. However, even model updates are subject to attack and can leak private information about local data. Differential Privacy (DP) is a leading mitigation strategy which involves adding noise to clipped model updates, trading off model performance for strong theoretical privacy guarantees. Previous work has shown the threat model of DP is conservative and that obtained guarantees may not directly translate into information leakage in practice. In this work, we aim to achieve a tighter measurement of the model exposure by considering a realistic threat model for the federated setting. We propose a novel attack method, CANIFE, that utilises ‘canaries’ – samples that are carefully crafted by a strong adversary and can be used to augment DP-FL training pipelines to evaluate empirical privacy.

The Importance of Incorporating Data Context When Developing XAI Frameworks (top)

by Saif Anwar

Trustworthiness of ML techniques is difficult to assess in high-risk or ethically sensitive scenarios. Many techniques using complex ML models are treated as a ‘black box’ where the reasoning or criteria for the final decision is opaque to the user. Explainable AI (XAI) aims to ‘unbox’ complex models and provide reliable explanations to support adoption in high-risk fields such as healthcare and criminal justice. Some existing XAI approaches approximate model behaviour using perturbed data. However, these have been criticised for ignoring feature dependencies, with explanations being generated using potentially unrealistic data. This work shows how the absence of data context results in inaccurate and untrustworthy explanations. We propose CHILLI, a novel approach for incorporating data context into XAI to improve both the soundness and accuracy of the explanations, ultimately increasing trust in both the explanation and the AI model.

Invariant Lipschitz Bandits: A Side Observation Approach (top)

by Nam Tran

Symmetry arises in many optimization and decision-making problems, and has attracted considerable attention from the optimization community: By utilizing the existence of such symmetries, the process of searching for optimal solutions can be improved significantly. Despite its success in (offline) optimization, the utilization of symmetries has not been well examined within the online optimization settings, especially in the bandit literature. As such, we study the invariant Lipschitz bandit setting, a subclass of the Lipschitz bandits where the reward function and the set of arms are preserved under a group of transformations. We introduce an algorithm named \texttt{UniformMesh-N}, which naturally integrates side observations using group orbits. We prove an improved regret upper bound, which depends on the cardinality of the group. We also prove a matching lower bound on regret for the invariant Lipschitz bandit class. We hope that our work will ignite further investigation of symmetry in sequential decision-making theory.

Order-Oriented Transaction Fee Mechanism (top)

by Mohammad Nourbakhsh

In blockchain systems, miners are incentivized by block rewards and transaction fees to maintain the blockchain and secure the public ledger. Since the advent of Bitcoin, various mechanisms have been proposed to allocate transactions in blocks and pay miners for their efforts. However, the current state of transaction fee mechanisms (TFMs) has primarily focused on the inclusion of transactions in the block, neglecting the importance of their order. In blockchain systems that support smart contracts, the order of transactions can significantly impact the outcome for users. This effect has given rise to a class of attacks known as "order manipulation attacks," which undermine the security of blockchain systems. In response to these challenges, we present the first Order-Oriented Transaction Fee Mechanism (OTFM). This mechanism is designed to address the impact of ordering on users' profits and mitigate the consequences of order manipulation attacks.

Security Analysis of mobile Point-of-Sale (mPoS) Terminals (top)

by Mahshid Mehr Nezhad

The growth of mobile Point-of-Sale (mPoS) terminals has been driven by the increasing use of card-present (CP) transactions. These terminals, which can be used with a mobile phone, offer a convenient and low-cost way for merchants to process transactions. In this paper, we explore the security risks associated with mPoS terminals, with a focus on the role of the merchant's mobile phone in the mPoS ecosystem. Our analysis covers the security of the mobile phone's communication with the mPoS terminal and payment provider server, as well as the security risks in the mobile phone application. We demonstrated an eavesdropping attack to reveal cryptographic keys in the BLE communication, executed a Man-in-the-middle attack to tamper with mPoS terminal messages, and reverse-engineered the mobile phone application to disable security features. Based on these findings, we propose solutions to improve mPoS system security and mitigate security risks.

Developments in Self-Enforcing E-Voting (top)

by Luke Harrison

The term "self-enforcing e-voting" refers to the class of end-to-end (E2E) verifiable e-voting systems that require no dedicated tallying authorities (TAs) to verify the results of an election. This class of e-voting systems has been largely unexplored for the fairer and more representative voting methods, including Condorcet and Instant Runoff Voting (IRV), and largely untrialled for practical elections. In this talk, we review our developments within this space. We present VERICONDOR, our self-enforcing Condorcet e-voting system, and discuss our current progress in developing a self-enforcing e-voting system for IRV. The main novelty of VERICONDOR lies in its well-formedness verification, which was unexplored in prior work. Additionally, we discuss the feasibility of deploying self-enforcing e-voting systems in practice, with reference to the recent trial of the DRE-ip voting system as part of the Durga Puja festival in India.

A Heterogeneous Redundant Architecture for Industrial Control System Security (top)

by Zhihao Dai

Component-level heterogeneous redundancy is gaining popularity as an approach for preventing single-point security breaches in Industrial Control Systems (ICSs), especially with regard to core components such as Programmable Logic Controllers (PLCs). Existing heterogeneous redundancy approaches advocate attack resilience via pairwise comparison among outputs from multiple PLCs. These approaches incur increased resource costs due to their high degree of redundancy, and do not address concurrent attacks. In this paper we address both issues, demonstrating a data-driven component selection approach that achieves a trade-off between resource cost and security. In particular, we propose (i) a novel dual-PLC ICS architecture with native pairwise comparison which can offer limited yet comparable defence against single-point breaches, (ii) a machine-learning based selection mechanism which can deliver resilience against non-concurrent attacks under resource constraints, and (iii) a scaled-up variant of the proposed architecture to counteract concurrent attacks with modest resource implications.

ML in Biology

Supervised Machine Learning Model for Diabetic Students’ Glucose Levels Classification System (top)

by Mona Alotaibi

Accurate and timely blood glucose prediction is essential for students who have been diagnosed with diabetes, to prevent hypoglycemia and hyperglycemia episodes throughout the school day. Continuous Glucose Monitoring (CGM), a form of sensor that can be incorporated into the Internet of Things (IoT), is used to manage blood sugar levels effectively. In addition, machine learning (ML) models can be used to accurately predict glucose levels from a CGM system. We aim to develop a decision support system based on ML algorithms and the IoT, for which we have applied Random Forest, Logistic Regression, AdaBoost (Ada), Multi-layer Perceptron (MLP), and linear SVM algorithms, followed by a majority voting algorithm. Evaluation metrics such as the confusion matrix, accuracy, precision, F1 score and recall were used to evaluate the performance of each algorithm. The experimental results demonstrate that the accuracy of the majority voting model is 99.61%.
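
The final majority-vote step over the five classifiers can be sketched as follows; this is a minimal sketch, and the class encoding and per-model predictions below are invented for illustration:

```python
from collections import Counter

def majority_vote(predictions):
    """Combine per-model class predictions by majority vote.
    predictions: list of lists, one inner list per model."""
    combined = []
    for votes in zip(*predictions):  # one tuple of votes per sample
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined

# Hypothetical outputs of the five classifiers on four glucose readings:
# 0 = hypoglycemic, 1 = normal, 2 = hyperglycemic.
model_preds = [
    [1, 1, 2, 0],  # Random Forest
    [1, 2, 2, 0],  # Logistic Regression
    [1, 1, 2, 0],  # AdaBoost
    [0, 1, 2, 0],  # MLP
    [1, 1, 0, 0],  # linear SVM
]
final = majority_vote(model_preds)  # -> [1, 1, 2, 0]
```

The ensemble thus overrules any single model's outlier prediction, which is why its accuracy can exceed that of each base classifier.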

Different cortical connectivities in human females and males relate to differences in physical workload and body composition, reward and emotional systems, and memory (top)

by Ruohan Zhang

Differences in functional connectivity between females and males are well established, but what they relate to is unclear. We investigated this by identifying functional connectivity links that differ between females and males, and then analysing associations of these with behavioural measures. Most of the functional connectivities were lower in females, with a mean Cohen’s d of -0.18. The lower functional connectivities in females especially involved somatosensory/premotor regions including the insula, opercular cortex, paracentral lobule and mid-cingulate cortex, and were correlated with lower maximum workload and with higher whole-body fat mass. But some functional connectivities were higher in females, involving especially the ventromedial prefrontal cortex and posterior cingulate cortex, and these were correlated with higher liking for some rewards such as sweet foods, higher happiness/subjective well-being, and better memory-related functions.

Weakly Supervised Learning for Predicting Viral Oncoprotein in Nasopharyngeal Carcinoma (top)

by Made Satria Wibawa

Nasopharyngeal carcinoma (NPC) is a type of cancer originating in the nasopharynx, often associated with Epstein-Barr virus (EBV) infection. This virus encodes several viral oncoproteins, such as Latent Membrane Protein 1 (LMP1), which may promote NPC progression. Because of this, LMP1 is an excellent therapeutic target for EBV-associated NPC. However, detecting LMP1 in NPC with standard testing incurs additional costs and delays diagnosis. In this study, we utilised haematoxylin and eosin-stained (H&E) histology images, which are more readily available. Using a deep learning-based method, we attempt to predict the LMP1 status of NPC. We utilised an attention mechanism with residual connections in our weakly supervised learning. In our 3-fold cross-validation experiment, we achieved an average accuracy, AUC and F1-score of 0.936, 0.995 and 0.862, respectively. This method also allows us to examine model interpretability. To the best of our knowledge, this is the first attempt to predict LMP1 status in NPC.

Spatial-temporal graph convolutional neural networks on evolving cell surfaces (top)

by Edward Offord

Many cellular processes involve complex deformations of the cell surface that change over time, which are difficult to automatically detect and analyse in 3D microscopy images. Using modern machine learning methods to extract dynamic features from surface triangulations of a cell is a major challenge, because the topology is constantly changing over time. To incorporate time as a feature to inform segmentation on cell surfaces, this talk presents heterogeneous graph neural networks using a time-based hypergraph of a cell timeseries. Here, we focus on the identification of macropinocytic cups on the surface of Dictyostelium cells, structures involved in the uptake of extracellular fluid. The network classifies each node as belonging to a cup or not, enabling subsequent studies of the detailed distribution of molecules regulating fluid uptake in cells.

Machine Learning + AI (Problem Exploration)

Network dissection in Diffusion Model (top)

by Yu Cao

Recently, the diffusion model has become a rising class of generative models by virtue of its powerful generative ability. Compared to other deep generative models, however, diffusion models are an even bigger black box for humans. In this work, we use network dissection, an analytical framework that systematically identifies the semantics of individual hidden units in image classification and image generation networks. First, by analysing the roles of the units, we find evidence that the units in the VDM model are utilised significantly more efficiently than those in the GAN. Secondly, by analysing the changes when small sets of units are activated or deactivated, we find that although the mathematical principles of the two models are the same, in practice the two models have completely different operating mechanisms. Thus, we answer the question of why the DDPM performs substantially better than the VDM, from both a theoretical and an experimental perspective.

Fault Tolerant Task Offloading in Edge-Based IoT Networks: A Multi-armed Bandit Approach (top)

by Alya Alhazmi

IoT devices have limited computing capabilities, and cannot execute computationally intensive tasks such as deep learning. To address this issue, devices offload some tasks to powerful servers at the edge of the network, to bound communication delays. As users are mobile, they can only communicate with a varying set of edge nodes (ENs) at any given time. Hence, as IoT devices move and offload their tasks, the capacities of the available ENs vary, giving rise to a dynamic network and exacerbating the offloading problem, which can be stated as follows: Given multiple ENs with unknown capabilities and a mobility pattern, select the most appropriate EN set that guarantees minimum delay and energy consumption. We formalise this problem as a delay-energy optimization problem by jointly considering EN selection for task offloading and fault tolerance in the edge network. We develop a multi-armed bandit framework that learns under the various uncertainties in system dynamics.
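
A minimal sketch of the bandit component, assuming a standard UCB1 strategy over candidate edge nodes; the per-node reward values are deterministic toy numbers, whereas the talk's formulation jointly optimises delay, energy and fault tolerance:

```python
import math

def ucb1_select(counts, means, t):
    """Pick the edge node maximising mean reward plus an exploration bonus."""
    for arm, n in enumerate(counts):
        if n == 0:
            return arm  # try every node at least once
    return max(range(len(counts)),
               key=lambda a: means[a] + math.sqrt(2 * math.log(t) / counts[a]))

true_reward = [0.3, 0.8, 0.5]  # hypothetical per-node offloading utility
counts = [0, 0, 0]
means = [0.0, 0.0, 0.0]
for t in range(1, 501):
    arm = ucb1_select(counts, means, t)
    reward = true_reward[arm]  # deterministic toy feedback for illustration
    counts[arm] += 1
    means[arm] += (reward - means[arm]) / counts[arm]  # running average
# Node 1 (highest utility) ends up selected far more often than the others.
```

In the real setting the reward would be a noisy function of observed delay and energy, and the arm set itself changes as the user moves, which is what makes the problem harder than this sketch.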

Research in Federated Graph Neural Networks (top)

by Yan Qian

Federated Learning (FL) has been a hot research topic, enabling the collaborative training of machine learning models among different organizations under privacy restrictions. As researchers try to support more machine learning models with different privacy-preserving approaches, there is a need to develop systems and infrastructures that ease the development of various federated learning algorithms. Federated learning systems are equally important and face challenges from various aspects such as effectiveness, efficiency, and privacy. Graph Neural Network (GNN) research is growing rapidly, helping to learn distributed representations from graph-structured data. On the other hand, centralizing a massive amount of real-world graph data for GNN training is prohibitive due to privacy concerns, regulation restrictions, and commercial competition. FL could enable GNNs to be trained in a distributed system, which helps both with large-scale datasets and with the privacy concerns.

Monocular Depth Estimation (MDE) (top)

by Bashayer Abdallah

Computer vision applications, such as scene understanding and reconstruction, rely on monocular depth estimation from images. Traditional monocular depth estimation (MDE) methods, such as defocus-based methods, rely on depth cues for depth prediction and are unsuitable for images with blurred texture. Deep learning (DL) techniques for MDE were recently proposed, showing great promise in dealing with this classic ill-posed problem. However, those methods continue to suffer from edge-blur challenges, especially in complex object structures. The blurry edges contribute to depth estimation errors and result in flying pixels. Our initial work is to review state-of-the-art edge detection studies and how to implement these algorithms on our MDE baseline, to comprehend the effects of such methods on capturing the fine edges in predicted depth maps.

Exploration in NAS (Neural Architecture Search) techniques (top)

by Minghao Xu

Neural Architecture Search (NAS) has been widely used to automatically discover better neural network models in an ample search space of candidate network architectures. However, due to the massive number of architectures, it usually takes hundreds or thousands of GPU hours of training and testing to find the optimal structure. Recent works focus on new strategies, named zero-cost proxies, that remarkably reduce search time. By introducing such metrics, we can predict the expected performance of a network within seconds, which is efficient enough for computers to search the vast search space at a significantly lower cost. My research interests lie in designing such strategies to optimize the NAS procedure: not only by proposing new zero-cost proxies, but also by developing more intelligent model search algorithms which conserve previously trained parameters to retain accumulated accuracy, as in transfer learning.

What would Galileo think of Neural Architecture Search? (top)

by Rosco Hunter

This talk will introduce the field of Neural Architecture Search (NAS), a technique for automating the discovery of the best-performing neural network architectures with respect to a chosen combination of accuracy, latency, etc. The talk will cover examples of metrics that are predictive of network architecture performance; the role of NAS benchmarks in evaluating the effectiveness of different metrics; and the potential of NAS to advance our understanding of why neural networks work at all.

Machine Learning (Problem Exploration)

Multi-Camera Multi-Object Trajectory Forecasting (MCTF) (top)

by Amey Noolkar

The objective of MCTF is to predict the trajectory of a moving object across a network of cameras. The existing approach (Styles et al.) tracks one object across multiple cameras; however, its use for surveillance applications is limited. Robust object-tracking and prediction methods exist for single cameras, but they are not easily scaled to a network of cameras. We address these shortcomings by proposing an MCTF model with a location-tensor-based framework that represents the location of objects across multiple cameras using the position of their bounding boxes over time. The model will follow a Which-When-Where approach and learn to predict the movement of the bounding boxes across the network of cameras in an image-agnostic way. Concretely, we want to predict in which camera(s) the bounding boxes corresponding to the tracked objects are most likely to appear, and when and where within the camera views they would appear.

Explaining predictions of Neural Network with hidden paradigms (top)

by Bhushan Atote

Artificial intelligence (AI) has achieved enormous successes and plays an integral part in our daily lives. Despite these tremendous breakthroughs, a major barrier to the deployment of AI-based systems is that they frequently lack transparency. Such systems lose people’s trust, which has reignited the discussion on explainable AI (XAI). XAI has a lot of potential for enhancing the transparency of, and trust in, AI-based systems. Strong predictions can be made using Deep Neural Networks (DNNs), but they cannot be directly explained. Concept-based explanations justify model judgements in terms of perceptibly human-friendly representations, such as word phrases in text or super-pixels in images. Convolutional neural networks (CNNs), in particular, are DNNs used in computer vision. In this work, we introduce a deep neural architecture, the Hidden Paradigm Network (HiPaNet): the network analyses an image to find hidden paradigms and then produces the classification result by highlighting their importance.

Machine Learning for Histopathology Images (top)

by George Wright

Pathologists manually examining slides under a microscope is considered best practice in the diagnosis of many medical problems. This is a costly and time-consuming process due to a shortage of pathologists and an increase in demand. Computational pathology aims to automate clinical workflows using machine learning algorithms to analyse histopathology whole slide images (WSIs). These algorithms are advantageous as they can quickly provide a standardised diagnosis incorporating several different data sources and can offer valuable insights such as the factors affecting prognosis. However, because of the large memory requirements of multi-gigapixel WSIs, traditional image analysis algorithms cannot be used directly. Specialist computational pathology algorithms exist in numerous medical domains and in many cases can outperform a trained pathologist. This problem exploration talk will explore recent advances in the methods used for processing histopathology images and their possible applications.

Source Code Plagiarism with Supervised Machine Learning: Open Issues and Challenges (top)

by Fahad Ebrahim

Plagiarism in programming assignments is referred to as "source code plagiarism". As manual checks for plagiarism are time-consuming, efforts to automate this process target fair evaluation and integrity. Supervised Machine Learning (SML) has been applied to source code plagiarism, since plagiarism can be treated as a binary classification problem with two outputs: 'yes' (plagiarised) or 'no' (not plagiarised). A survey of SML approaches, conducted as part of my Master’s dissertation, revealed several open issues. There is a lack of adequate academic datasets due to legal/social issues. The available reuse datasets have a severe class imbalance between training and testing sets, as the proportion of real plagiarism is small compared to non-plagiarised code. The performance of the models depends heavily on feature extraction and selection. Most of the developed models are language-specific and treat source code as plain natural language, which limits their capability.

Towards Efficient Large-Scale Federated Learning (top)

by Sara Alosaime

Data privacy concerns and strict government legislation (e.g. GDPR) have raised interest in privacy-preserving distributed training. Federated Learning (FL), proposed by Google in 2016, allows clients to synchronise locally trained models rather than their own data. Mobile Edge Computing is ideal for large-scale FL, with hierarchical layers to train a generic and robust model on many clients. However, as deployments scale up, unique FL challenges arise related to clients' heterogeneity, lack of resources, availability and communication issues. Consequently, the number of stragglers and dropped participants increases, preventing FL from reaching the target convergence within the expected time and wasting clients' resources. In light of these real-world challenges, we aim to design a multi-layer FL architecture for a Human Activity Recognition system that balances training progress, model accuracy, and resource preservation.

Optimal Transport for Causal Abstractions (top)

by Giorgos Felekis

Complex systems span a range of fields, from physical systems like particle behaviour to social systems like decision-making. The challenge in studying complex systems is their level of detail regarding the dependencies and interactions within their parts or with their environment. Structural Causal Models (SCMs) are commonly used to model these systems by representing the cause-and-effect relationships between variables and predicting the consequences of changes/interventions to them. To simplify their analysis, we create abstractions: simplified, high-level versions that retain the crucial information. Our goal is to connect these abstractions back to the original, more detailed models, allowing us to jointly reason about evidence across models with multiple levels of granularity. To achieve this, we introduce a novel approach that leverages Optimal Transport theory to align the probability distributions resulting from interventions on the Structural Causal Models of both the detailed and abstracted systems.