List of Talks
- Sebastian Lerch (Karlsruhe Institute of Technology, Karlsruhe, Germany)
Title: Generative machine learning methods for multivariate ensemble post-processing
Abstract: Ensemble weather forecasts based on multiple runs of numerical weather prediction models typically show systematic errors and require post-processing to obtain reliable forecasts. Accurately modeling multivariate dependencies is crucial in many practical applications, and various approaches to multivariate post-processing have been proposed where ensemble predictions are first post-processed separately in each margin and multivariate dependencies are then restored via copulas. These two-step methods share common key limitations, in particular the difficulty of including additional predictors when modeling the dependencies. We propose a novel multivariate post-processing method based on generative machine learning to address these challenges. In this new class of nonparametric data-driven distributional regression models, samples from the multivariate forecast distribution are directly obtained as output of a generative neural network. The generative model is trained by optimizing a proper scoring rule which measures the discrepancy between the generated and observed data, conditional on exogenous input variables. Our method does not require parametric assumptions on univariate distributions or multivariate dependencies and allows for incorporating arbitrary predictors. In two case studies on multivariate temperature and wind speed forecasting at weather stations over Germany, our generative model shows significant improvements over state-of-the-art methods and particularly improves the representation of spatial dependencies.
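As a rough illustration of the training objective described above, the following is a minimal PyTorch sketch of a conditional generative network trained by minimizing a sample-based proper scoring rule (here the energy score, chosen for illustration). The architecture, data shapes, sample count and toy data are assumptions, not the authors' implementation.

```python
# Minimal sketch: conditional generative network trained with the energy score.
# Shapes, architecture and hyperparameters are illustrative assumptions only.
import torch
import torch.nn as nn

d_pred, d_out, d_latent, m = 8, 5, 10, 20  # predictors, output dim, latent dim, samples per case

generator = nn.Sequential(
    nn.Linear(d_pred + d_latent, 64), nn.ReLU(),
    nn.Linear(64, d_out),
)

def energy_score(samples, y):
    # samples: (batch, m, d_out), y: (batch, d_out)
    # ES = E||X - y|| - 0.5 E||X - X'||, estimated from m samples per forecast case
    term1 = torch.norm(samples - y.unsqueeze(1), dim=-1).mean(dim=1)
    diffs = samples.unsqueeze(1) - samples.unsqueeze(2)            # (batch, m, m, d_out)
    term2 = torch.norm(diffs, dim=-1).mean(dim=(1, 2))
    return (term1 - 0.5 * term2).mean()

opt = torch.optim.Adam(generator.parameters(), lr=1e-3)
x = torch.randn(32, d_pred)       # ensemble-derived predictors (toy data)
y = torch.randn(32, d_out)        # verifying multivariate observations (toy data)

for step in range(200):
    z = torch.randn(32, m, d_latent)
    inp = torch.cat([x.unsqueeze(1).expand(-1, m, -1), z], dim=-1)
    samples = generator(inp)      # m draws from the forecast distribution per case
    loss = energy_score(samples, y)
    opt.zero_grad(); loss.backward(); opt.step()
```

At prediction time, arbitrarily many multivariate samples can be drawn by feeding fresh latent noise together with the predictors.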
- Bobby Antonio (University of Bristol)
Title: Post-processing East African precipitation forecasts using a generative machine learning model
Abstract: Existing weather models are known to have poor skill over Africa, where there are regular threats of drought and floods that present significant risks to people's lives and livelihoods. Improved precipitation forecasts could help mitigate the negative effects of these extreme weather events, as well as providing significant financial benefits to the region. Building on work that successfully applied a state-of-the-art machine learning method (a conditional Generative Adversarial Network, cGAN) to post-process precipitation forecasts in the UK, we present a novel way to improve precipitation forecasts in East Africa. We address the challenge of realistically representing tropical convective rainfall in this region, which is poorly simulated in conventional forecast models. We use a cGAN to post-process ECMWF high-resolution forecasts at 0.1 degree resolution and 6-18h lead times, using the IMERG dataset as ground truth, and investigate how well this model can correct bias, produce reliable probability distributions, and create samples of rainfall with realistic spatial structure. This has the potential to enable cost-effective improvements to early warning systems in the affected areas.
- Alban Farchi (Ecole des Ponts ParisTech, France)
Title: Online model error correction with neural networks - from theory to the ECMWF forecasting system
Abstract: Recent studies have shown that it is possible to combine machine learning (ML) with data assimilation (DA) to reconstruct the dynamics of a system that is partially and imperfectly observed. This approach takes advantage of the strengths of both methods. DA is used to estimate the system state from the observations, while ML computes a surrogate model of the dynamical system based on those estimated states. The surrogate model can be defined as a hybrid combination in which a physical part based on prior knowledge is enhanced with a statistical part estimated by a neural network. The training of the neural network is usually done offline, once a large enough dataset of model state estimates is available.
Online learning has been investigated recently. In this case, the surrogate model is improved each time a new system state estimate is computed. Although online approaches still require a large dataset to achieve good performance, they naturally fit the sequential framework in the geosciences, where new observations become available over time.
Going even further, we propose to merge the DA and ML steps. This is technically achieved by estimating the system state and the surrogate model parameters at the same time. This new method has been applied to a two-scale Lorenz system and a two-layer two-dimensional quasi-geostrophic model. Preliminary results show the potential of tightly integrating DA and ML, and pave the way towards its application to the Integrated Forecasting System (IFS) used for operational Numerical Weather Prediction at ECMWF.
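The hybrid surrogate described above can be pictured with a minimal sketch: a physics-based model step is augmented by a neural-network error correction, trained on pairs of consecutive analysis states. Everything below (the Lorenz-96 stand-in for the physics, the toy data and the offline training loop) is an illustrative assumption, not the IFS/ECMWF implementation or the online algorithm itself.

```python
# Minimal sketch of a hybrid surrogate: physical model step + neural-network error correction,
# trained offline on pairs of consecutive analysis (state-estimate) snapshots. Illustrative only.
import torch
import torch.nn as nn

d = 40  # toy state dimension

def physical_step(x, dt=0.05):
    # Stand-in for the physics-based model (here a Lorenz-96 Euler step).
    F = 8.0
    dxdt = (torch.roll(x, -1, -1) - torch.roll(x, 2, -1)) * torch.roll(x, 1, -1) - x + F
    return x + dt * dxdt

correction = nn.Sequential(nn.Linear(d, 64), nn.Tanh(), nn.Linear(64, d))

def hybrid_step(x):
    # Surrogate = physical step enhanced by a learned correction of the model error.
    return physical_step(x) + correction(x)

# Offline training: fit the correction so one hybrid step maps each analysis state to the next.
# In practice the analyses come from a DA system; here random toy data stand in for them.
x_analysis = torch.randn(500, d)
x_next = physical_step(x_analysis) + 0.1 * torch.randn(500, d)  # toy "truth" with model error

opt = torch.optim.Adam(correction.parameters(), lr=1e-3)
for epoch in range(100):
    loss = ((hybrid_step(x_analysis) - x_next) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```

In the online/merged setting described in the abstract, the correction parameters would instead be updated jointly with the state estimate each time new observations are assimilated.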
- Matt Graham (University College London)
Title: ParticleDA.jl: distributed data assimilation with particle filters
Abstract: Data assimilation (DA) methods combine prior knowledge about a physical system, formulated as a probabilistic model, with data corresponding to observations of the system over time, to estimate the evolution of the state of the system. Importantly, the estimates produced need to reflect the uncertainty arising both from having only partial and noisy observations and from incomplete knowledge of the model dynamics. Particle filters are a particularly appealing approach for carrying out DA as they do not make strong Gaussian assumptions about the distribution of the states. However, naïve implementations require simulating very large ensembles, which can make them computationally expensive when used with complex high-dimensional models. ParticleDA.jl is an open-source package for performing particle-filter-based data assimilation on distributed computing systems, developed at UCL's Advanced Research Computing Centre. ParticleDA.jl is written in Julia, which allows for a high-level interface that is easy to use and extend while remaining highly performant. To allow scaling to high-dimensional models, the package takes a two-pronged approach, implementing statistically efficient particle filtering algorithms which reduce the ensemble sizes needed, while allowing large ensembles to be efficiently simulated in parallel on high-performance computing systems.
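For readers unfamiliar with the underlying algorithm, the following is a minimal bootstrap particle filter on a toy one-dimensional model. It only illustrates the basic filtering cycle (propagate, weight, resample); it is not the ParticleDA.jl API, which is a Julia package, and the toy model and noise levels are assumptions.

```python
# Minimal bootstrap particle filter on a toy 1D random-walk model, for illustration only.
# ParticleDA.jl itself is a Julia package; this sketch only shows the underlying algorithm.
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_steps = 1000, 50
obs_noise, model_noise = 0.5, 0.2

# Simulate a "true" trajectory and noisy observations of it.
truth = np.cumsum(model_noise * rng.standard_normal(n_steps))
obs = truth + obs_noise * rng.standard_normal(n_steps)

particles = np.zeros(n_particles)
estimates = []
for t in range(n_steps):
    # Propagate each particle through the (stochastic) dynamical model.
    particles += model_noise * rng.standard_normal(n_particles)
    # Weight particles by the observation likelihood.
    log_w = -0.5 * ((obs[t] - particles) / obs_noise) ** 2
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    estimates.append(np.sum(w * particles))
    # Resample to avoid weight degeneracy (naive multinomial resampling).
    idx = rng.choice(n_particles, size=n_particles, p=w)
    particles = particles[idx]
```

The statistically efficient filters and the distributed-memory parallelism mentioned in the abstract address exactly the weaknesses of this naive version: the ensemble size needed and the cost of simulating it.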
- Dan Crisan (Imperial College)
Title: Calibration of stochastic parametrizations for geophysical fluid dynamic models
Abstract: Stochastic parametrizations have been used increasingly in numerical weather and climate modelling. They have a great potential to increase the predictive capability of next-generation weather and climate models. For judicious usage in modelling fluid evolution, one needs to calibrate the parametrizations to data. In this talk, I will address this requirement for the rotating shallow water (RSW) model with a novel stochastic parametrization that ensures the physical properties of the deterministic model are preserved and is compatible with model reduction techniques. The method is generic and can be applied to arbitrary stochastic parametrizations. It is also agnostic as to the source of data (real or synthetic). It is based on a principal component analysis technique to generate the eigenvectors and the eigenvalues of the covariance matrix of the stochastic parametrization. For the stochastic RSW model considered here, we calibrate the noise by using the elevation variable of the model, as this is an observable easily obtainable in practical applications, and use synthetic data as input for the calibration procedure. The talk is based on the paper: “D Crisan, O Lang, A Lobbe, PJ van Leeuwen, R Potthast, Noise calibration for the stochastic rotating shallow water model, arXiv preprint arXiv:2305.03548”
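The principal-component step can be sketched schematically as follows: eigenvectors and eigenvalues of a covariance matrix of observed fluctuations are extracted and the leading modes are retained as a noise basis. The way the fluctuations are constructed and the retained-variance threshold below are toy assumptions, not the calibration procedure of the paper.

```python
# Schematic of calibrating a noise basis by principal component analysis of observed
# fluctuations (here of a toy elevation field); not the full procedure of the paper.
import numpy as np

rng = np.random.default_rng(1)
n_snapshots, n_grid = 200, 500

# Toy "fluctuation" snapshots, e.g. differences between fine- and coarse-scale elevation fields.
fluctuations = rng.standard_normal((n_snapshots, n_grid))

# Remove the temporal mean, then compute EOFs (eigenvectors of the spatial covariance) via SVD.
anomalies = fluctuations - fluctuations.mean(axis=0)
U, s, Vt = np.linalg.svd(anomalies, full_matrices=False)
eigenvalues = s**2 / (n_snapshots - 1)
eofs = Vt                      # rows are eigenvectors of the spatial covariance matrix

# Retain the leading modes explaining, say, 90% of the variance as the noise basis.
k = int(np.searchsorted(np.cumsum(eigenvalues) / eigenvalues.sum(), 0.9)) + 1
noise_basis = eofs[:k] * np.sqrt(eigenvalues[:k, None])

# One realization of the calibrated noise: xi = sum_i sqrt(lambda_i) * W_i * eof_i.
xi = rng.standard_normal(k) @ noise_basis
```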
- Ryuichi Kanai (University College London)
Title: Functional History Matching: A new method and its application
Abstract: Computer simulations capable of reproducing observed data are clearly desirable. However, it is extremely rare to know appropriate input parameter values before running a simulation. Moreover, large-scale simulations may take a very long time, from one week to several months for a single computation, making it impractical to estimate suitable input parameter values by performing numerous simulations with different settings. One approach to this problem is History Matching, which makes it possible to estimate a plausible range of input parameter values by comparing the results of a limited number of simulations, run with different input parameter values, against observed data. The method was first proposed by Craig et al. [1], who used it to estimate parameter values for an oil well model matching the observed data. It has since been applied to many problems, such as galaxy formation, transmission of HIV, and climate. However, History Matching is applicable only to discrete data, not to continuous (functional) data. We therefore developed a new method, Functional History Matching, which extends the approach to continuous data for the first time. In this method, we consider functions in a Sobolev space and use the Sobolev norm to compare the observed data with simulation results, capturing features of the function shape that cannot be evaluated by simple norms such as the L2 norm. In addition, the Outer Product Emulator [2] is used to emulate functional data. As a result, it becomes possible to estimate the plausible region of input parameter values for simulations with functional output. (A small numerical illustration of the Sobolev-norm comparison appears after the references below.)
[1] P. Craig, M. Goldstein, A. Seheult, and J. Smith. "Bayes Linear strategies for matching hydrocarbon reservoir history". Bayesian statistics, 5:69–95 (1996).
[2] J. Rougier. "Efficient emulators for multivariate deterministic functions." Journal of Computational and Graphical Statistics 17.4:827-843 (2008).
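As a small numerical illustration of the shape-aware comparison mentioned above, the sketch below computes a discretized H^1 (Sobolev) distance between a toy simulator output and toy observed data, and contrasts it with the L2 distance. The grid, the toy functions and the implausibility tolerance are illustrative assumptions, not part of the Functional History Matching method itself.

```python
# Schematic comparison of a simulator output and observed data as functions, using a
# discretized H^1 (Sobolev) norm that penalizes differences in value and in slope/shape.
import numpy as np

x = np.linspace(0.0, 1.0, 200)
dx = x[1] - x[0]
observed = np.sin(2 * np.pi * x)
simulated = np.sin(2 * np.pi * x + 0.2)      # toy simulator output for one parameter setting

def l2_norm(f, dx):
    return np.sqrt(np.sum(f**2) * dx)

def h1_norm(f, dx):
    # ||f||_{H^1}^2 = integral of f^2 + (f')^2, approximated on the grid.
    fp = np.gradient(f, dx)
    return np.sqrt(np.sum(f**2) * dx + np.sum(fp**2) * dx)

diff = simulated - observed
print("L2 distance:", l2_norm(diff, dx))
print("H1 distance:", h1_norm(diff, dx))     # larger, since it also sees the slope mismatch

# In history matching, parameter settings whose (emulated) distance to the observed function
# exceeds a tolerance are ruled out as implausible; the tolerance here is purely illustrative.
tolerance = 0.5
implausible = h1_norm(diff, dx) > tolerance
```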
- Philippe Naveau (LSCE, France)
Title: A variational auto-encoder approach to sample multivariate extremes
Abstract: (Joint work with Nicolas Lafon (LSCE) and Ronan Fablet (IMT-Atlantique).) Rapidly generating accurate extremes from an observational dataset is crucial when seeking to estimate risks associated with the occurrence of future extremes which could be larger than those already observed. Applications range from the occurrence of natural disasters to financial crashes. This work details a variational auto-encoder (VAE) approach for sampling multivariate extremes. The proposed architecture is based on extreme value theory (EVT) and, more particularly, on the notion of multivariate functions with regular variation. Experiments conducted on synthetic datasets as well as on a dataset of discharge measurements along the Danube river network illustrate the relevance of our approach.
- Massimiliano Tamborrino (University of Warwick)
Title: Guided sequential ABC schemes for simulation-based inference
Abstract: Sequential algorithms such as sequential importance sampling (SIS) and sequential Monte Carlo (SMC) have proven fundamental in Bayesian inference for models not admitting a readily available likelihood function. For approximate Bayesian computation (ABC), SMC-ABC is the state-of-the-art sampler. However, since the ABC paradigm is intrinsically wasteful, sequential ABC schemes can benefit from well-targeted proposal samplers that efficiently avoid improbable parameter regions. We contribute to the ABC modeller's toolbox with novel proposal samplers that are conditional on summary statistics of the data. In this way, the proposed parameters are "guided" to rapidly reach regions of the posterior surface that are compatible with the observed data. This speeds up the convergence of these sequential samplers, reducing the computational effort while preserving accuracy in the inference. We provide a variety of Gaussian and copula-based guided samplers for both SIS-ABC and SMC-ABC, showing how they ease inference for challenging case studies.
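The "guided" idea can be sketched in its simplest Gaussian form: fit a joint Gaussian to the (parameter, summary-statistic) pairs of the current particle population and condition it on the observed summaries to obtain the next proposal. The sketch below omits the SIS/SMC importance weights, perturbation kernels and the copula-based variants, and uses toy data throughout.

```python
# Sketch of a Gaussian guided proposal for sequential ABC: fit a joint Gaussian over
# (theta, summaries) from the current accepted particles, then condition on the observed
# summaries to obtain a proposal for the next round. Weights and kernels are omitted.
import numpy as np

rng = np.random.default_rng(2)

def guided_gaussian_proposal(thetas, summaries, s_obs):
    # thetas: (n, d_theta) accepted parameters; summaries: (n, d_s) their simulated summaries.
    joint = np.hstack([thetas, summaries])
    mu = joint.mean(axis=0)
    cov = np.cov(joint, rowvar=False)
    d = thetas.shape[1]
    mu_t, mu_s = mu[:d], mu[d:]
    S_tt, S_ts, S_ss = cov[:d, :d], cov[:d, d:], cov[d:, d:]
    # Conditional Gaussian of theta given summaries = s_obs.
    K = S_ts @ np.linalg.inv(S_ss)
    cond_mean = mu_t + K @ (s_obs - mu_s)
    cond_cov = S_tt - K @ S_ts.T
    return cond_mean, cond_cov

# Toy usage: previous-round particles and their simulated summary statistics.
thetas = rng.normal(size=(500, 2))
summaries = thetas @ np.array([[1.0, 0.2], [0.1, 0.8]]).T + 0.1 * rng.normal(size=(500, 2))
s_obs = np.array([0.5, -0.3])

mean, cov = guided_gaussian_proposal(thetas, summaries, s_obs)
proposed_theta = rng.multivariate_normal(mean, cov)   # a "guided" draw for the next ABC round
```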
- Maud Lemercier (University of Oxford)
Title: Neural Stochastic PDEs: Resolution-Invariant Learning of Continuous Spatiotemporal Dynamics
Abstract: Stochastic partial differential equations (SPDEs) are the mathematical tool of choice for modelling spatiotemporal PDE-dynamics under the influence of randomness. Based on the notion of mild solution of an SPDE, we introduce a novel neural architecture to learn solution operators of PDEs with (possibly stochastic) forcing from partially observed data. The proposed Neural SPDE model provides an extension to two popular classes of physics-inspired architectures. On the one hand, it extends Neural CDEs and variants -- continuous-time analogues of RNNs -- in that it is capable of processing incoming sequential information arriving at arbitrary spatial resolutions. On the other hand, it extends Neural Operators -- generalizations of neural networks to model mappings between spaces of functions -- in that it can parameterize solution operators of SPDEs depending simultaneously on the initial condition and a realization of the driving noise. By performing operations in the spectral domain, we show how a Neural SPDE can be evaluated in two ways, either by calling an ODE solver (emulating a spectral Galerkin scheme), or by solving a fixed point problem. Experiments on various semilinear SPDEs, including the stochastic Navier-Stokes equations, demonstrate how the Neural SPDE model is capable of learning complex spatiotemporal dynamics in a resolution-invariant way, with better accuracy and lighter training data requirements compared to alternative models, and up to 3 orders of magnitude faster than traditional solvers.
- Sigurd Assing (University of Warwick)
Title: One way to turn the primitive equations into stochastic dynamical systems for climate modelling
Abstract: I am going to motivate an established approach to climate modelling. The first step gives a rather complicated system of equations for so-called resolved and unresolved variables, and the second step is about simplifying a scaled version of this system of equations. The second step is based on some ad hoc assumptions and stochastic model reduction. I'll explain what we (in joint work with Flandoli and Pappalettera) were able to do about this, but I'll also raise awareness of what we could not do. Some of this might be of interest to statisticians, too, as knowing the modelling mechanism would shed light on how data should be mapped to parameters and coefficients of climate models.
- Lorenzo Pacchiardi (University of Oxford)
Title: Probabilistic Forecasting with Generative Networks via Scoring Rule Minimization
Abstract: Probabilistic forecasting relies on past observations to provide a probability distribution for a future outcome, which is often evaluated against the realisation using a scoring rule. Here, we perform probabilistic forecasting with generative neural networks, which parametrize distributions on high-dimensional spaces by transforming draws from a latent variable. Generative networks are typically trained in an adversarial framework. In contrast, we propose to train generative networks to minimise a predictive-sequential (or prequential) scoring rule on a recorded temporal sequence of the phenomenon of interest, which is appealing as it corresponds to the way forecasting systems are routinely evaluated. Adversarial-free minimization is possible for some scoring rules; hence, our framework avoids the cumbersome hyperparameter tuning and uncertainty underestimation due to unstable adversarial training, thus unlocking reliable use of generative networks in probabilistic forecasting. Further, we prove consistency of the minimizer of our objective with dependent data, while adversarial training assumes independence. We perform simulation studies on two chaotic dynamical models and a benchmark dataset of global weather observations; for this last example, we define scoring rules for spatial data by drawing from the relevant literature. Our method outperforms state-of-the-art adversarial approaches, especially in probabilistic calibration, while requiring less hyperparameter tuning.
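The prequential objective can be pictured as follows: slide along a recorded sequence, have the generative network forecast the next step from the recent past, and accumulate a scoring rule against the realized value. The sketch below uses the energy score and a small feed-forward network; the architecture, window length, score and toy data are illustrative assumptions, not the authors' setup.

```python
# Sketch of prequential scoring-rule training: at each time in a recorded sequence, forecast
# the next step from the recent past with a generative network and minimize the accumulated
# energy score against the realized values. Architecture and data are illustrative only.
import torch
import torch.nn as nn

d, window, m = 3, 5, 10          # state dim, conditioning window, samples per forecast
net = nn.Sequential(nn.Linear(window * d + d, 64), nn.ReLU(), nn.Linear(64, d))

def forecast_samples(history):
    # history: (window, d) -> m samples of the next state, driven by latent noise z.
    z = torch.randn(m, d)
    ctx = history.reshape(1, -1).expand(m, -1)
    return net(torch.cat([ctx, z], dim=-1))

def energy_score(samples, y):
    t1 = torch.norm(samples - y, dim=-1).mean()
    t2 = torch.norm(samples.unsqueeze(0) - samples.unsqueeze(1), dim=-1).mean()
    return t1 - 0.5 * t2

series = torch.randn(200, d)      # recorded temporal sequence of the phenomenon (toy data)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for epoch in range(50):
    loss = 0.0
    for t in range(window, series.shape[0]):
        samples = forecast_samples(series[t - window:t])
        loss = loss + energy_score(samples, series[t])
    loss = loss / (series.shape[0] - window)
    opt.zero_grad(); loss.backward(); opt.step()
```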
- Marvin Pförtner (University of Tübingen)
Title: Physics-Informed Gaussian Process Regression Generalizes Linear PDE Solvers
Abstract: Linear partial differential equations (PDEs) are an important, widely applied class of mechanistic models, describing physical processes such as heat transfer, electromagnetism, and wave propagation. In practice, specialized numerical methods based on discretization are used to solve PDEs. They generally use an estimate of the unknown model parameters and, if available, physical measurements for initialization. Such solvers are often embedded into larger scientific models with a downstream application and thus error quantification plays a key role. However, by ignoring parameter and measurement uncertainty, classical PDE solvers may fail to produce consistent estimates of their inherent approximation error. In this talk, I will show that solving linear PDEs can be interpreted as physics-informed Gaussian process (GP) regression. Crucially, this probabilistic viewpoint makes it possible to (1) quantify the inherent discretization error; (2) propagate uncertainty about the model parameters to the solution; and (3) condition on noisy measurements. Demonstrating the strength of this formulation, I will show that it strictly generalizes methods of weighted residuals, a central class of PDE solvers including collocation, finite volume, pseudospectral, and (generalized) Galerkin methods such as finite element and spectral methods. This class can thus be directly equipped with a structured error estimate. Finally, I will present the theoretical backbone of the approach: a key generalization of the Gaussian process inference theorem to observations made via an arbitrary bounded linear operator.
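The core mechanism (conditioning a GP prior on observations made through a linear operator) can be illustrated with a very simple boundary value problem. In the sketch below the differential operator is discretized by finite differences on a grid, which is a simplifying assumption for illustration; the talk's framework works with exact operator evaluations of the kernel, and the kernel choice and grid here are likewise assumptions.

```python
# Sketch of conditioning a GP prior on observations made through a linear operator: a
# finite-difference discretization of -d^2/dx^2, solving -u'' = f with u(0) = u(1) = 0.
import numpy as np

n = 50
x = np.linspace(0.0, 1.0, n)
h = x[1] - x[0]

# Squared-exponential prior covariance for the unknown solution u on the grid.
ell, sigma = 0.2, 1.0
K = sigma**2 * np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ell**2)

# Rows of A encode the PDE residuals -u''(x_i) at interior points (second differences)...
A = np.zeros((n - 2, n))
for i in range(n - 2):
    A[i, i:i + 3] = np.array([-1.0, 2.0, -1.0]) / h**2
# ...plus two rows for the Dirichlet boundary conditions u(0) = u(1) = 0.
B = np.zeros((2, n)); B[0, 0] = 1.0; B[1, -1] = 1.0
A_full = np.vstack([A, B])

f = np.pi**2 * np.sin(np.pi * x[1:-1])       # right-hand side, so the true solution is sin(pi x)
y = np.concatenate([f, [0.0, 0.0]])
noise = 1e-6 * np.eye(A_full.shape[0])       # small jitter / observation noise

# Standard GP conditioning on the linear observations y = A_full u + noise.
S = A_full @ K @ A_full.T + noise
gain = K @ A_full.T @ np.linalg.inv(S)
post_mean = gain @ y                          # posterior mean approximates the PDE solution
post_cov = K - gain @ A_full @ K              # its diagonal quantifies the remaining uncertainty

print("max |posterior mean - sin(pi x)|:", np.max(np.abs(post_mean - np.sin(np.pi * x))))
```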
- Matthew Willson (DeepMind)
Title: GraphCast: Learning skillful medium-range global weather forecasting
Abstract: We present our recent work on GraphCast (https://arxiv.org/abs/2212.12794), a machine-learning-based weather simulator. GraphCast is based on graph neural networks and was trained on 40+ years of reanalysis data to forecast various surface and atmospheric variables over 10 days, at 0.25deg resolution. It can generate a forecast in 60 seconds, and outperforms ECMWF's deterministic operational forecasting system, HRES, on around 90% of 2760 target variables. These results represent a key step forward in complementing and improving weather modeling with ML, opening new opportunities for fast, accurate forecasting. In this talk we will present an updated and detailed evaluation of GraphCast against HRES, discussing some of the difficulties involved, and some directions for the future.
- James Briant (University College London)
Title: Machine Learning and Climate Model Fusion: Embedding High Resolution Variability into a Coarse Resolution Climate Simulation
Abstract: Underrepresentation of cloud formation is a known failing in current climate simulations. This is due to the coarse grid resolution required by the computational constraints of integrating over long time scales, which does not resolve the underlying cloud-generating physical processes. This work employs a multi-output Gaussian process (MOGP) trained on high-resolution Unified Model (UM) runs to predict the variability of temperature and specific humidity fields. A proof-of-concept study has been carried out in which a trained MOGP model is coupled in situ with a simplified Atmospheric General Circulation Model (AGCM) named SPEEDY. The temperature and specific humidity profiles of the SPEEDY model are perturbed at each timestep according to the predicted high-resolution-informed variability. Ten-year forecasts are generated for both the default and the fused SPEEDY models, and the output fields are compared to ensure that the fused predictions remain representative of Earth's atmosphere. Some changes in the precipitation and in the outgoing longwave and shortwave radiation patterns are observed, indicating modelling improvements in the complex region surrounding India and the Indian Ocean.
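The coupling idea can be caricatured as follows: a regression model maps the coarse-model state to the high-resolution variability of temperature and humidity, and the coarse profiles are perturbed each timestep with draws scaled by that predicted variability. The sketch below treats the outputs independently with a shared kernel (a simplification of a full MOGP), and neither SPEEDY nor the UM training data are represented; all quantities are toy assumptions.

```python
# Toy sketch of the fusion idea: learn a mapping from the coarse-model state to the
# high-resolution variability (std. dev.) of temperature/humidity, then perturb the coarse
# profiles each timestep with draws scaled by the predicted variability. Illustrative only.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

rng = np.random.default_rng(3)

# Training data (toy): coarse-state features -> per-level variability of T and q.
n_train, n_features, n_levels = 200, 4, 8
X_train = rng.normal(size=(n_train, n_features))
y_train = np.abs(rng.normal(size=(n_train, 2 * n_levels)))   # stands in for UM-derived std devs

gp = GaussianProcessRegressor(kernel=RBF(length_scale=1.0), normalize_y=True)
gp.fit(X_train, y_train)          # independent outputs with a shared kernel (MOGP simplification)

def perturb_profiles(coarse_features, T_profile, q_profile):
    # Predict variability for this column and add correspondingly scaled random perturbations.
    sd = gp.predict(coarse_features.reshape(1, -1))[0]
    T_new = T_profile + sd[:n_levels] * rng.standard_normal(n_levels)
    q_new = q_profile + sd[n_levels:] * rng.standard_normal(n_levels)
    return T_new, q_new

# One "timestep" of the coupled loop (toy values).
T, q = np.full(n_levels, 250.0), np.full(n_levels, 0.005)
T, q = perturb_profiles(rng.normal(size=n_features), T, q)
```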
- Johanna Ziegel (University of Bern)
Title: Easy Uncertainty Quantification (EasyUQ): Generating predictive distributions from single-valued model output
Abstract: How can we quantify uncertainty if our favourite computational tool – be it a numerical, a statistical, or a machine learning approach, or just any computer model – provides single-valued output only? We introduce the Easy Uncertainty Quantification (EasyUQ) technique, which transforms real-valued model output into calibrated statistical distributions, based solely on training data of model output-outcome pairs, without any need to access model input. In its basic form, EasyUQ is a special case of the recently introduced Isotonic Distributional Regression (IDR) technique. EasyUQ yields discrete predictive distributions that are calibrated and optimal in finite samples, subject to stochastic monotonicity. The workflow is fully automated, without any need for tuning. The Smooth EasyUQ approach supplements IDR with kernel smoothing, to yield continuous predictive distributions that preserve key properties of the basic form, including both stochastic monotonicity with respect to the original model output and asymptotic consistency. We use simulation examples and the WeatherBench challenge in data-driven weather prediction to illustrate the techniques. In a study of benchmark problems from machine learning, we show how EasyUQ and Smooth EasyUQ can be integrated into the workflow of modern neural network learning and hyperparameter tuning, and find EasyUQ to be competitive with more elaborate input-based approaches. (Joint work with Eva-Maria Walz, Alexander Henzi, Tilmann Gneiting.)
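The flavour of the basic (discrete) EasyUQ idea can be approximated threshold by threshold: for each threshold t, the predictive probability P(Y <= t | model output x) is fitted by isotonic regression of the exceedance indicator on x, constrained to be non-increasing in x. The sketch below uses scikit-learn's isotonic regression as a simplified stand-in for the IDR machinery (no kernel smoothing, toy data only).

```python
# Simplified illustration of the basic EasyUQ/IDR idea: for each threshold t, fit
# P(Y <= t | model output x) by isotonic regression of 1{y <= t} on x, non-increasing in x.
# Not the full IDR implementation and no Smooth EasyUQ kernel smoothing.
import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.default_rng(4)

# Training pairs: single-valued model output x and realized outcomes y (toy data).
n = 1000
x = rng.uniform(0, 10, size=n)
y = x + rng.normal(scale=1.0 + 0.2 * x, size=n)   # heteroscedastic errors the point forecast ignores

thresholds = np.linspace(y.min(), y.max(), 60)
fits = []
for t in thresholds:
    iso = IsotonicRegression(increasing=False, out_of_bounds="clip")
    iso.fit(x, (y <= t).astype(float))
    fits.append(iso)

def predictive_cdf(x_new):
    # Calibrated, discrete predictive CDF evaluated at the thresholds, for one model output.
    return np.array([fit.predict([x_new])[0] for fit in fits])

cdf = predictive_cdf(7.5)
print("approx P(Y <= 7.5 | x = 7.5):", np.interp(7.5, thresholds, cdf))
```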
- Peter Watson (University of Bristol)
Title: Machine learning applications for weather and climate need greater focus on extremes
Abstract: Multiple studies have now demonstrated that machine learning (ML) can give improved skill for simulating fairly typical weather events in climate simulations, for tasks such as downscaling to higher resolution and emulating and speeding up expensive model parameterisations. Many of these used ML methods with very high numbers of parameters, such as neural networks, which are the focus of the discussion here. Little attention has been given to the performance of these methods for extreme events of the severities relevant to many critical weather and climate prediction applications, with return periods of more than a few years. This leaves a lot of uncertainty about the usefulness of these methods, particularly for general-purpose models that must perform reliably in extreme situations. ML models may be expected to struggle to predict extremes because few samples of such events are usually available.
This presentation will review the small number of studies that have examined the skill of machine learning methods in extreme weather situations. It will be shown using recent results that machine learning methods that perform reasonably for typical weather events can have very large errors in extreme situations, highlighting the necessity of testing the performance for these cases. Extrapolation to extremes is found to work well in some studies, however.
It will be argued that more attention needs to be given to performance for extremes in work applying ML in climate science. Research gaps that seem particularly important are identified. These include investigating the behaviour of ML systems in events lying multiple standard deviations beyond previously observed records, as have occurred in the past, and evaluating the performance of complex generative models in extreme events. Approaches to address these problems will be discussed.
- Frank Kwasniok (University of Exeter)
Title: Data-driven deterministic and stochastic subgrid-scale parameterisation in atmosphere and ocean models
- Fiona Turner (King's College London)
Title: Emulating ice loss: building probabilistic projections of sea level rise with Gaussian process emulation
Abstract: Changes in the world’s land ice are projected to be the largest contributor to future global sea level rise over the next few centuries, with the Antarctic ice sheet being the most uncertain mass change (IPCC, 2021); statistical methods are required to quantify model uncertainties and estimate more robust projections. We present here projections of the Antarctic contribution to sea level rise up to the year 2300. Building on the work of Edwards et al. (2021), we design ensembles for the ice sheet models fETISh and PISM under two Shared Socioeconomic Pathways (SSPs), perturbing model settings and using multiple global climate models as forcing. Rather than building an emulator of an ensemble of ice sheet models, as in that study, we build an emulator for each individual ice sheet model to better understand the biases and internal variability of each model. We also incorporate the methods described in Rougier (2008) and Rougier et al. (2009), transforming the multi-centennial output of the models to allow us to emulate all years together, rather than one at a time as was previously done. We predict changes for different SSPs to estimate how emissions scenarios will affect the probability of different Antarctic ice sheet contributions to sea level rise, and demonstrate the differing sensitivity to inputs and forcings of the ensemble of models used. (A small illustrative sketch of the "emulate all years together" strategy appears after the references below.)
References
Edwards, T. L., Nowicki, S., Marzeion, B., Hock, R., Goelzer, H., Seroussi, H., ... others (2021). Projected land ice contributions to twenty-first-century sea level rise. Nature, 593(7857), 74–82.
IPCC. (2021). Climate Change 2021: The Physical Science Basis. In V. Masson-Delmotte et al. (Eds.), Contribution of Working Group I to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change. Cambridge University Press.
Rougier, J. (2008). Efficient emulators for multivariate deterministic functions. Journal of Computational and Graphical Statistics, 17(4):827–843.
Rougier, J., Guillas, S., Maute, A., and Richmond, A. D. (2009). Expert knowledge and multivariate emulation: The thermosphere–ionosphere electrodynamics general circulation model (TIE-GCM). Technometrics, 51(4):414–424.
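One common way to emulate all years together, in the spirit of the multivariate emulation cited above, is to project the time series output onto a few principal components and emulate the component scores with Gaussian processes. The sketch below uses toy data and a generic PCA-plus-GP construction; it is illustrative only and is not the study's actual emulator or the outer-product formulation.

```python
# Sketch of emulating multi-centennial time series output "all years together": reduce the
# output dimension with PCA and emulate the retained component scores with GPs.
# Toy data only; not the fETISh/PISM ensembles or the study's emulator.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.default_rng(5)

# Ensemble design: input parameters (e.g. model settings, forcing indices) and the
# corresponding simulated sea-level contribution time series (a toy surrogate here).
n_runs, n_params, n_years = 80, 5, 300
X = rng.uniform(-1, 1, size=(n_runs, n_params))
years = np.arange(n_years)
Y = np.outer(X[:, 0] + 0.5 * X[:, 1], (years / n_years) ** 2) + 0.01 * rng.normal(size=(n_runs, n_years))

# Project the time series onto a few principal components and emulate each score with a GP.
pca = PCA(n_components=3)
scores = pca.fit_transform(Y)
gps = []
for j in range(scores.shape[1]):
    gp = GaussianProcessRegressor(kernel=RBF(length_scale=np.ones(n_params)) + WhiteKernel(1e-4),
                                  normalize_y=True)
    gp.fit(X, scores[:, j])
    gps.append(gp)

def emulate_timeseries(x_new):
    # Predict the scores at a new parameter setting and map back to a full time series.
    pred_scores = np.array([gp.predict(x_new.reshape(1, -1))[0] for gp in gps])
    return pca.inverse_transform(pred_scores.reshape(1, -1))[0]

projection_to_2300 = emulate_timeseries(np.array([0.3, -0.2, 0.1, 0.0, 0.5]))
```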
- Francois-Xavier Briol (University College London)
Title: TBD
Abstract: TBD
- Andrew Kirby (University of Oxford)
Title: Data-driven modelling of turbine wake interactions and flow resistance in large wind farms
Abstract: Turbine wake and local blockage effects are known to alter wind farm power production in two different ways: (1) by changing the wind speed locally in front of each turbine; and (2) by changing the overall flow resistance in the farm and thus the so-called farm blockage effect. To better predict these effects with low computational costs, we develop data-driven emulators of the `local' or `internal' turbine thrust coefficient CT* as a function of turbine layout. We train the model using a multi-fidelity Gaussian Process (GP) regression with a combination of low (engineering wake model) and high-fidelity (Large-Eddy Simulations) simulations of farms with different layouts and wind directions. A large set of low-fidelity data speeds up the learning process and the high-fidelity data ensures a high accuracy. The trained multi-fidelity GP model is shown to give more accurate predictions of CT* compared to a standard (single-fidelity) GP regression applied only to a limited set of high-fidelity data. We also use the multi-fidelity GP model of CT* with the two-scale momentum theory (Nishino & Dunstan 2020, J. Fluid Mech. 894, A2) to demonstrate that the model can be used to give fast and accurate predictions of large wind farm performance under various mesoscale atmospheric conditions. This new approach could be beneficial for improving annual energy production (AEP) calculations and farm optimisation in the future.