From decision-making to communication: Data visualisation during a global crisis by Professor Cagatay Turkay
About: Data visualisations play key roles in the increasingly data-intensive practices of science, industry, governance, and public communication and discourse. These roles have become even more prominent in epidemiology and public health research, and in the public debate around them, during the ongoing pandemic.
In this talk, Professor Turkay will share recent research results on developing interactive data visualisations to facilitate an engaged epidemiological model-building process that informs public health policy-making, as well as empirical research that studies the extent to which visualisations support the understanding of complex models of disease dynamics. Through these examples, the talk will aim to highlight visualisation both as a human-centred data analysis methodology in an increasingly automated landscape, and as a lens to study interactions between people and data.
Translating engineering knowledge into process models for the steel industry by Professor Michael Auinger
About: Combining engineering knowledge with data-driven simulation is a key challenge for future industries. It is a particular challenge for the steel industry, which needs to consider a multitude of processes along the entire manufacturing chain.
Over the years, vast amounts of data have been collected from individual process steps, which have led to the development of useful prediction tools. Whilst these models are optimised to control the process for their respective application, they lack cross-compatibility along the wider steel production route. Furthermore, optimisation of industrial processes has often been driven by human experience.
The push towards economic efficiency has meant that the possible influences in a specific process-control model are rarely quantified thoroughly. This carries the danger of reaching a local maximum in performance, beyond which a single process cannot be optimised further. Considering the challenges of multiple processes working together, it is also easy to see that the optimum for steel production as a whole is not the same as the optima of all the individual processes. Some processes need to reach an optimum with respect to throughput, whilst others need to be optimised in terms of energy use, resilience in fuel source, running costs or losses due to corrosion. This talk will therefore introduce digital approaches which are being developed in the UK-wide EPSRC steel partnership SUSTAIN, as well as in collaboration with partners from the national and international steel sector.
The genealogy of a sequential Monte Carlo algorithm by Dr Paul Jenkins
About: In this talk, Dr Jenkins will summarise recent work which is part of the Turing Data-centric Engineering project “Sequential sampling methods for difficult problems”. Sequential Monte Carlo is a simulation-based approach to the approximation of a sequence of measures by an evolving particle system, with diverse applications such as target tracking, neuroimaging, and autonomous navigation.
Underlying the algorithm is a 'genealogy' relating the particles to one another, and the distribution of this genealogy can tell us about the algorithm's efficiency. We derive the asymptotic distribution of the genealogy as the size of the particle system grows.
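To make the idea of a particle genealogy concrete, here is a minimal sketch of a bootstrap particle filter that records which parent each particle was resampled from. The toy random-walk model, the Gaussian likelihood, the multinomial resampling scheme and the function names `bootstrap_filter` and `count_roots` are all assumptions made for illustration; they are not taken from the talk or the project.

```python
import math
import random


def bootstrap_filter(observations, n_particles, rng):
    """Run a toy bootstrap particle filter and record parent indices.

    Assumed model (illustrative only): a Gaussian random-walk state
    observed with unit-variance Gaussian noise.
    """
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]
    ancestry = []  # ancestry[t][i] = parent index of particle i after step t
    for y in observations:
        # Weight each particle by the likelihood of the observation.
        weights = [math.exp(-0.5 * (y - x) ** 2) for x in particles]
        # Multinomial resampling: draw parent indices in proportion to weight.
        parents = rng.choices(range(n_particles), weights=weights, k=n_particles)
        ancestry.append(parents)
        # Propagate each child from its parent's state.
        particles = [particles[p] + rng.gauss(0.0, 1.0) for p in parents]
    return particles, ancestry


def count_roots(ancestry, n_particles):
    """Trace the final particles back to time 0 and count distinct ancestors."""
    lineage = set(range(n_particles))
    for parents in reversed(ancestry):
        lineage = {parents[i] for i in lineage}
    return len(lineage)


rng = random.Random(0)
final_particles, ancestry = bootstrap_filter([0.1, -0.2, 0.3, 0.0, 0.5], 50, rng)
print("distinct time-0 ancestors:", count_roots(ancestry, 50))
```

The fewer distinct ancestors that survive, the more the lineages have coalesced, which is one informal way to see why the shape of the genealogy relates to the algorithm's efficiency.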
From decision-making to communication: Data visualisation during a global crisis by Professor Graham Cormode
About: In the federated setting, data is held by distributed entities, who wish to collaborate to perform some data analysis, while ensuring privacy.
In this talk, Graham will discuss some recent results on gathering private histograms, and their application for evaluating machine learning models.
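As a rough illustration of what gathering a private histogram can look like, the sketch below uses k-ary randomised response, a textbook mechanism for local differential privacy. Treating "privacy" as local differential privacy is an assumption, as are the parameter `epsilon` and the function names; this is not presented as the method discussed in the talk.

```python
import math
import random
from collections import Counter


def randomize(value, categories, epsilon, rng):
    """Report the true category with probability p, otherwise a different one."""
    k = len(categories)
    p_true = math.exp(epsilon) / (math.exp(epsilon) + k - 1)
    if rng.random() < p_true:
        return value
    # Otherwise pick uniformly among the other k - 1 categories.
    return rng.choice([c for c in categories if c != value])


def estimate_histogram(reports, categories, epsilon):
    """Unbiased estimate of the true counts from the randomised reports."""
    k = len(categories)
    n = len(reports)
    p = math.exp(epsilon) / (math.exp(epsilon) + k - 1)  # P(report = truth)
    q = 1.0 / (math.exp(epsilon) + k - 1)  # P(report = a given other value)
    counts = Counter(reports)
    # E[count_c] = n_c * p + (n - n_c) * q, solved for n_c.
    return {c: (counts[c] - n * q) / (p - q) for c in categories}


rng = random.Random(1)
categories = ["a", "b", "c"]
true_values = ["a"] * 600 + ["b"] * 300 + ["c"] * 100  # made-up toy data
reports = [randomize(v, categories, 1.0, rng) for v in true_values]
print(estimate_histogram(reports, categories, 1.0))
```

Each participant only ever sends a randomised report, so the collector never sees raw values. The estimated counts still sum to the number of participants, although individual estimates are noisy and can be negative for rare categories.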
Involving service users and carers in identifying schizophrenia and dementia stigma on Twitter by Dr Sagar Jilka
About: Stigma has negative effects on people with mental health problems by making them less likely to seek help. We developed a proof-of-principle, service-user-supervised machine learning pipeline to identify stigmatising tweets reliably and to understand the prevalence of public schizophrenia stigma on Twitter.
A service user group advised on the machine learning model evaluation metric (fewest false negatives) and on the features for machine learning. We collected 13,313 public tweets on schizophrenia between January and May 2018. Two service user researchers manually identified stigma in 746 English tweets; 80% were used to train eight models, and 20% for testing. The two models with the fewest false negatives were compared in two service user validation exercises, and the best model was used to classify all extracted public English tweets.
Tweets classed as stigmatising by service users were more negative in sentiment (t (744) = 12.02, p < 0.001 [95% CI: 0.196–0.273]). Our linear Support Vector Machine was the best performing model, with the fewest false negatives and higher service user validation. This model identified public stigma in 47% of English tweets (n = 5,676), which were more negative in sentiment (t (12,143) = 64.38, p < 0.001 [95% CI: 0.29–0.31]). Machine learning, with service user involvement, can identify stigmatising tweets at large scale; service users provide a novel perspective on minimising issues such as bias. Given the prevalence of stigma, there is an urgent need for education and online campaigns to reduce it. Machine learning can provide a real-time metric on their success.
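The selection rule described above (hold out 20% of the labelled tweets, then prefer the model with the fewest false negatives) can be sketched as follows. The helper names, the placeholder texts and the two rule-based stand-in "models" are hypothetical; the study's actual features and its eight trained models are not reproduced here.

```python
import random


def split_80_20(items, rng):
    """Shuffle a copy of the labelled items and split it 80% / 20%."""
    shuffled = items[:]
    rng.shuffle(shuffled)
    cut = int(0.8 * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def false_negatives(model, test_set):
    """Count items labelled 1 (stigmatising) that the model predicts as 0."""
    return sum(1 for text, label in test_set if label == 1 and model(text) == 0)


def pick_model(models, test_set):
    """Return the name of the model with the fewest false negatives."""
    return min(models, key=lambda name: false_negatives(models[name], test_set))


# Placeholder labelled data: (text, label), with 1 meaning "stigmatising".
labelled = [("example flagged text", 1), ("example neutral text", 0)] * 5
train, test = split_80_20(labelled, random.Random(0))

# Hypothetical stand-ins for trained classifiers.
models = {
    "never_flags": lambda text: 0,
    "keyword_rule": lambda text: 1 if "flagged" in text else 0,
}
print(pick_model(models, labelled))
```

Counting false negatives alone would also reward a model that flags every tweet, so the metric needs a second check alongside it; in the abstract, the two shortlisted models go on to service user validation exercises.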
Modelling the spread of SARS-CoV-2 in the UK by Professor Mike Tildesley
About: In this presentation, Mike will give an overview of the models developed by the University of Warwick during the COVID-19 pandemic and how they have been utilised to inform government decision making.
He will discuss how these models are constructed, what data are used to parameterise them and how they are used for forecasting. He will present outputs from the models developed by Warwick, with a particular focus on the impact of lockdown policies, optimal strategies for vaccine roll-out depending on efficacy and uptake across the population, the potential for spread in educational settings, and our forecasts of epidemic spread in winter 2021/22.