# Abstracts

**Marc Bocquet** (1,2), Pavel Sakov (3)

(1) Université Paris-Est, CEREA joint laboratory Ecole des Ponts ParisTech and EdF R&D, France.

(2) INRIA, Paris Rocquencourt research center, France.

(3) Bureau of Meteorology, Melbourne, Australia.

*Parameter estimation with the iterative ensemble Kalman smoother and application to a coupled meteorological/tracer low-order model*

Both ensemble filtering and variational data assimilation methods have proven useful in the joint estimation of state variables and parameters of geophysical models, yet their respective benefits and drawbacks in this task are distinct. An ensemble variational method, known as the iterative ensemble Kalman smoother (IEnKS), has recently been introduced. It is based on an adjoint-free variational scheme that is nevertheless flow-dependent. As such, the IEnKS is a candidate tool for joint state and parameter estimation that may inherit the benefits of both the ensemble filtering and variational approaches. In this study, an augmented-state IEnKS is first tested on the estimation of the forcing parameter of the Lorenz-95 model. Since joint state and parameter estimation is especially useful in applications where the forcings are uncertain but nevertheless decisive, typically in atmospheric chemistry, the augmented-state IEnKS is then tested on a new low-order model that takes its meteorological part from the Lorenz-95 model and its chemical part from the advection-diffusion of a tracer. In these experiments, the IEnKS is compared to the ensemble Kalman filter, the ensemble Kalman smoother, and 4D-Var, which are considered the methods of choice for such joint estimation problems. In this low-order model context, the IEnKS is shown to significantly outperform the other methods regardless of the length of the data assimilation window, for present-time as well as retrospective analysis. The advantage of the IEnKS is even more striking for parameter estimation: approaching its performance with 4D-Var is likely to require both a long data assimilation window and a complex model of the background statistics.
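The augmented-state idea can be made concrete with a minimal sketch (in Python with NumPy; the numbers, ensemble setup, and integration scheme are illustrative assumptions, not the configuration of the study): the forcing F of the Lorenz-95 model is appended to the state vector with persistence dynamics dF/dt = 0, so that an ensemble filter or smoother updates it jointly with the state through the ensemble cross-covariances.

```python
import numpy as np

def lorenz95_augmented_tendency(z, n=40):
    """Tendency of the Lorenz-95 model with the forcing F appended
    as an extra state component (augmented state)."""
    x, F = z[:n], z[n]
    dx = (np.roll(x, -1) - np.roll(x, 2)) * np.roll(x, 1) - x + F
    return np.append(dx, 0.0)  # dF/dt = 0: the parameter is persisted

def rk4_step(z, dt):
    """One fourth-order Runge-Kutta step of the augmented model."""
    k1 = lorenz95_augmented_tendency(z)
    k2 = lorenz95_augmented_tendency(z + 0.5 * dt * k1)
    k3 = lorenz95_augmented_tendency(z + 0.5 * dt * k2)
    k4 = lorenz95_augmented_tendency(z + dt * k3)
    return z + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

# Propagate an ensemble whose members each carry their own guess of F;
# the analysis step of any ensemble method then corrects F via its
# cross-covariance with the observed state variables.
rng = np.random.default_rng(0)
n, n_ens = 40, 20
ens = np.empty((n_ens, n + 1))
ens[:, :n] = 8.0 + rng.standard_normal((n_ens, n))
ens[:, n] = 8.0 + 0.5 * rng.standard_normal(n_ens)  # uncertain forcing
F0 = ens[:, n].copy()                               # initial parameter draws
for _ in range(50):
    ens = np.array([rk4_step(z, 0.05) for z in ens])
```

Because each member carries its own draw of F, the dynamics builds up correlations between F and the state, which is what allows the analysis step to correct the parameter.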

*Quantifying Bayesian filter performance through information theory*

I will exploit connections between the filtering problem and information theory in order to revisit the issue of filter optimality in the presence of model error, and to study the statistical accuracy of various imperfect Kalman filters for estimating the dynamics of high-dimensional, partially observed systems. The effects of model error on filter stability and accuracy in this setting are analysed through appropriate information measures which naturally extend the common path-wise estimates of filter performance, like the mean-square error or pattern correlation, to the statistical ‘superensemble’ setting. Particular emphasis is on the notion of practically achievable filter skill which requires trade-offs between different facets of filter performance; a new information criterion is introduced in this context and discussed on simple examples.

*Time-parallel algorithms for Variational Data Assimilation*

The current trend in data assimilation for Numerical Weather Prediction (NWP) is towards hybrid algorithms that attempt to combine the best features of variational and ensemble-based algorithms. It is likely, therefore, that variational methods will remain a component of NWP systems for some time to come. However, such algorithms face a challenge in view of the increasing trend towards highly parallel computers. To meet this challenge, new algorithms must be developed that provide increased opportunities for parallelisation. Of particular interest is parallelisation in the time dimension: a possibility that is not exploited by current algorithms. In this talk, I will present a parallel-in-time algorithm for four-dimensional variational data assimilation, and discuss its role within ECMWF's increasingly "hybrid" data assimilation system.

*Multiscale Filtering with Superparameterization*

Observations of a true signal from nature taken at a physical location include contributions from large and small spatial scales. Most atmosphere and ocean models fail to resolve all the active scales of the true system. These coarse models are imperfect models of the large scales, and 'parameterizations' attempt to improve the accuracy of the large-scale dynamics. The mismatch between the large-scale model and the full-scale observations is modeled in filtering frameworks by incorporating 'representation error' into the observation error, so that the observations can be viewed as observations of the large scales corrupted by both instrument error and small-scale variability.

We consider a toy model of dispersive turbulence and show how stochastic superparameterization, a particular kind of multiscale parameterization, can be used to accurately estimate representation error on the fly. We also develop an ensemble Kalman filter framework for simultaneously estimating the large- and small-scale parts of the true signal, with the significant caveat that the small-scale part is only estimated at the coarse grid points. The approach is tested in a particularly difficult scenario where the small scales account for two-thirds of the total variance at each coarse grid point, so that even highly-accurate observations contain only a minimal amount of large-scale information. The method performs well in this scenario, reliably estimating both the large-scale and small-scale parts of the true signal.
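The role of representation error can be illustrated with a scalar Kalman update in which the small-scale variance is folded into the observation-error variance (all numbers below are hypothetical; in the talk the representation error is estimated on the fly by stochastic superparameterization):

```python
# Hypothetical numbers: instrument-error variance, and a small-scale
# variance amounting to two-thirds of the total (as in the abstract).
var_instr = 0.01
var_total = 3.0
var_small = (2.0 / 3.0) * var_total      # representation error
var_large = var_total - var_small        # prior variance of the large scales

# Representation error is folded into the effective observation error:
var_obs_eff = var_instr + var_small

# Scalar Kalman update of the large-scale estimate with the inflated variance.
prior_mean = 0.0
y = 1.2                                  # hypothetical observation
gain = var_large / (var_large + var_obs_eff)
post_mean = prior_mean + gain * (y - prior_mean)
post_var = (1.0 - gain) * var_large
```

Even though the instrument error is tiny, the gain stays small because the small-scale variability dominates the effective observation error, which is exactly the "minimal amount of large-scale information" regime described above.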

*Targeted Sampling for Gaussian-Mixture Filtering High Dimensional Systems with Small Ensembles*

The update step in Gaussian-mixture filtering consists of a Kalman update for each center of the mixture, generalizing the update step of the ensemble Kalman filters (EnKFs) to non-Gaussian distributions. Sampling the posterior distribution is then required to integrate it forward in time with the dynamical model, which is known as the forecast step. For computational reasons, only small samples can be considered when dealing with large-scale atmospheric and oceanic models. As such, a "targeted" sampling that captures some features of the posterior distribution may be a better strategy than straightforward random sampling. This has been numerically demonstrated for the Gaussian-based ensemble filters, with the deterministic EnKFs providing better performance than the stochastic EnKF in many applications. In this talk, I will present two filtering algorithms based on this idea of "targeted" sampling: the first introduces a deterministic sampling of the observational errors in the stochastic EnKF, and the second is based on a Gaussian-mixture update step derived from a kernel parametrization of the forecast sample, followed by a resampling step that matches the first two moments of the posterior. Numerical results with the popular Lorenz-96 model will be presented and discussed.
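One simple way to realize the first idea, a targeted sampling of the observational errors, is to recentre and rescale a random draw so that its sample mean is exactly zero and its sample covariance exactly R. This is only an illustrative sketch of the general principle, not necessarily the scheme of the talk, and it requires the ensemble size to exceed the number of observations:

```python
import numpy as np

def deterministic_obs_perturbations(n_ens, R, rng):
    """Observation perturbations whose sample mean is exactly zero and
    whose sample covariance is exactly R: a 'targeted' alternative to
    plain random draws.  Requires n_ens - 1 >= dim(R)."""
    p = R.shape[0]
    E = rng.standard_normal((n_ens, p))
    E -= E.mean(axis=0)                   # exact zero sample mean
    C = E.T @ E / (n_ens - 1)             # current sample covariance
    # Linear transform mapping the sample covariance C exactly onto R.
    L_c = np.linalg.cholesky(C)
    L_r = np.linalg.cholesky(R)
    return E @ np.linalg.inv(L_c).T @ L_r.T

# Hypothetical example: 20 members, 3 correlated observations.
rng = np.random.default_rng(2)
R = np.array([[1.0, 0.3, 0.0],
              [0.3, 1.0, 0.2],
              [0.0, 0.2, 1.0]])
pert = deterministic_obs_perturbations(20, R, rng)
```

In the plain stochastic EnKF, the sampled perturbations only match R on average, and the sampling noise degrades the analysis for small ensembles; constraining the first two sample moments removes that particular source of error.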

*Coping with multi-scale in ensemble data assimilation*

Geophysical fluid systems exhibit a wide range of spatial and temporal scales. Their observing networks, consisting of in-situ and remote-sensing instruments, sample at multiple resolutions. To assimilate spatially dense, high-resolution observations together with sparse ones, a data assimilation scheme must fulfill two specific objectives. First, the large-scale flow components must be effectively constrained using the sparse observations. Second, the small-scale features resolved by the high-resolution observations must be utilized to the maximum degree possible. In this talk, we present a practical, multi-scale approach to data assimilation and demonstrate its advantage over conventional data assimilation. To demonstrate this advantage and gain insight into the multi-scale approach, the method is applied to the multi-scale Lorenz model and the results are analyzed in the context of both ensemble and variational methods.
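A widely used multi-scale Lorenz model is the two-scale Lorenz-96 system, in which each slow variable is coupled to a block of fast variables. The sketch below uses standard parameter choices, which are assumptions here and may differ from the exact setup of the talk:

```python
import numpy as np

def two_scale_l96_tendency(X, Y, F=10.0, h=1.0, c=10.0, b=10.0):
    """Tendencies of the two-scale Lorenz-96 model: K slow variables X,
    each coupled to J fast variables Y (stored flattened, length K*J)."""
    K = X.size
    J = Y.size // K
    dX = (np.roll(X, -1) - np.roll(X, 2)) * np.roll(X, 1) - X + F \
        - (h * c / b) * Y.reshape(K, J).sum(axis=1)
    dY = -c * b * np.roll(Y, -1) * (np.roll(Y, -2) - np.roll(Y, 1)) \
        - c * Y + (h * c / b) * np.repeat(X, J)
    return dX, dY

def rk4(X, Y, dt):
    """One fourth-order Runge-Kutta step of the coupled system."""
    k1 = two_scale_l96_tendency(X, Y)
    k2 = two_scale_l96_tendency(X + 0.5 * dt * k1[0], Y + 0.5 * dt * k1[1])
    k3 = two_scale_l96_tendency(X + 0.5 * dt * k2[0], Y + 0.5 * dt * k2[1])
    k4 = two_scale_l96_tendency(X + dt * k3[0], Y + dt * k3[1])
    return (X + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            Y + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

# Short integration: 8 slow variables, 32 fast variables per slow one.
rng = np.random.default_rng(3)
K, J = 8, 32
X = rng.standard_normal(K)
Y = 0.1 * rng.standard_normal(K * J)
for _ in range(200):
    X, Y = rk4(X, Y, 0.001)
```

The time-scale separation parameter c forces the small time step here; in a multi-scale assimilation experiment the sparse observations constrain X while dense observations inform the fast Y blocks.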

*A deterministic approach to filtering and EnKF for continuous stochastic processes observed at discrete times*

Filtering of a continuous-time stochastic process observed at discrete times is considered. An approach is proposed in which an accurate numerical approximation of the Fokker-Planck equation is used to obtain the predicting density. This density is then used to approximate either (a) the true filtering density arising from Bayes' rule, (b) the mean-field EnKF density, or (c) a Gaussian approximation. The local error of the EnKF is given as the sum of two components: (i) the error between the finite approximation and the mean-field limit (sample or discretization error), and (ii) the error between the mean-field limit and the true filtering distribution (linear update error). In the simplest form presented here, we prove that error (i) is asymptotically smaller with this new approach than with the standard EnKF for model dimensions d ≤ 3, and therefore any of (a), (b), or (c) outperforms the standard EnKF as long as (i) is the dominant source of error. Once error (ii) exceeds error (i), this improvement becomes irrelevant, and the standard EnKF with any sufficiently large ensemble size is comparable to (b). To confirm the analytical results relating to error (i) and to further investigate the effects of error (ii), we perform numerical experiments for both a linear and a nonlinear Langevin SDE. In the nonlinear case, we examine the effect of imposing a Gaussian approximation on the distribution and find that the approximate Gaussian filter may perform better than the non-Gaussian mean-field EnKF, depending on when the Gaussian approximation is imposed. It may be possible to use these results to develop more effective filters.
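For orientation, here is a minimal sketch of the standard-EnKF baseline in this setting, assuming a double-well Langevin SDE as the nonlinear example (the potential, time steps, and observation schedule are illustrative assumptions): the ensemble is integrated by Euler-Maruyama between observation times and updated by a perturbed-observation EnKF at each observation.

```python
import numpy as np

def forecast(ens, t_len, dt, rng):
    """Euler-Maruyama integration of the Langevin SDE
    dx = -V'(x) dt + sqrt(2) dW with double-well V(x) = x**4/4 - x**2/2."""
    for _ in range(int(t_len / dt)):
        drift = -(ens**3 - ens)
        ens = ens + drift * dt + np.sqrt(2 * dt) * rng.standard_normal(ens.size)
    return ens

def enkf_update(ens, y, obs_var, rng):
    """Perturbed-observation EnKF update for a directly observed scalar state."""
    var_f = ens.var(ddof=1)
    gain = var_f / (var_f + obs_var)
    y_pert = y + np.sqrt(obs_var) * rng.standard_normal(ens.size)
    return ens + gain * (y_pert - ens)

# Assimilate a few synthetic observations of a hypothetical truth near x = 1.
rng = np.random.default_rng(4)
truth, obs_var = 1.0, 0.1
ens = rng.standard_normal(100)
for _ in range(5):
    ens = forecast(ens, 0.5, 0.01, rng)
    y = truth + np.sqrt(obs_var) * rng.standard_normal()
    ens = enkf_update(ens, y, obs_var, rng)
```

The Fokker-Planck-based approach of the talk replaces the Monte Carlo forecast step with a deterministic approximation of the predicting density, which is the source of the improved error (i) in low dimensions.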

**Andy Majda**

*Data Driven Methods for Complex Turbulent Systems*

An important contemporary research topic is the development of physics-constrained, data-driven methods for complex, high-dimensional turbulent systems such as the equations of climate change science. Three new approaches to various aspects of this topic are emphasized here: 1) the systematic development of physics-constrained quadratic regression models with memory for low-frequency components of complex systems; 2) novel dynamic stochastic superresolution algorithms for real-time filtering of turbulent systems; 3) new nonlinear Laplacian spectral analysis (NLSA) algorithms for high-dimensional time series, which capture both intermittency and low-frequency variability, unlike conventional EOF or principal component analysis. This is joint work with John Harlim (1, 2), Michal Branicki (2), and Dimitri Giannakis (3).

Examples will include dynamic stochastic superresolution (DSS) for the mesoscale eddy heat flux from coarse satellite altimetry measurements, and low-frequency and intermittent modes in SST from 800-year runs of CCM. All papers mentioned in the talk are available at Majda's NYU faculty website, http://math.nyu.edu/faculty/majda/.

*Algorithms for Multiscale Filtering of Complex Turbulent Systems*

This lecture discusses recent multi-scale algorithms for filtering complex turbulent systems. The emphasis is on algorithms that blend a particle filter or an ensemble adjustment filter on a lower-dimensional subspace with conditional Gaussian filters on the orthogonal complement. A mathematical framework for these algorithms is developed, and applications to a recent adaptive blended particle filter are presented (joint work with Di Qi). A conceptual dynamical model for turbulence is introduced and utilized as a test bed for the suite of multi-scale filtering algorithms (joint work with Yoonsang Lee).

*Accounting for correlated observation errors in image data assimilation*

Satellite images can provide a wealth of information on the evolution of the Earth system. Although such sequences are frequently used, the spatial correlation of their errors is rarely taken into account in practice, which discards a large part of the information content of satellite image sequences. In this talk, we investigate a method based on wavelet or curvelet transforms to represent, at an affordable cost, some of the observation error correlation in a data assimilation context. We address the problem of monitoring the initial state of a system through the variational assimilation of images corrupted by spatially correlated noise. The feasibility and reliability of the approach are demonstrated in an academic context with a 2D shallow-water model.

*Particle filter for high-dimensional problems: combining optimal transportation with localisation*

Particle filters or sequential Monte Carlo methods are powerful tools for adjusting model state to data. However they suffer from the curse of dimensionality and have not yet found wide-spread application in the context of spatio-temporal evolution models. On the other hand, the ensemble Kalman filter with its simple Gaussian approximation has successfully been applied to such models using the concept of localization. Localization allows one to account for a spatial decay of correlation in a filter algorithm. In my talk, I will propose novel particle filter implementations which are suitable for localization and, as the ensemble Kalman filter, fit into the broad class of linear transform filters. In case of a particle filter this transformation will be determined by ideas from optimal transportation while in case of the ensemble Kalman filter one essentially relies on the linear Kalman update formulas. This common framework also allows for a mixture of particle and ensemble Kalman filters. Numerical results will be provided for the Lorenz-96 model which is a crude model for nonlinear advection.