# Abstracts

##### Saturday, 29 June

*9:50-10:55 Jim Berger (Duke U.) Encounters with Imprecise Probabilities [discussant: Chris Holmes (U. Oxford)]*

Abstract: There is a Society of Imprecise Probability (http://www.sipta.org/). At a recent annual meeting of the society, I gave this talk to illustrate some of the methods Bayesians use to deal with imprecise probability. The illustrations include dealing with interval-valued probabilities, the p-value problem, optimal normal hierarchical Bayesian analysis, and uncertainty quantification of complex computer models.

*11:20-12:25 Natalia Bochkina (U. Edinburgh) Semiparametric nonregular Bernstein-von Mises theorem for a mixture prior [discussant: Roberto Casarin (U. Venezia)]*

We consider the problem of density estimation where the density has an unknown lower support point, under local asymptotic exponentiality, in a fully Bayesian setting. To obtain the local concentration result for the marginal posterior of the lower support point (a Bernstein-von Mises type theorem), we give a set of conditions on the joint prior which ensure that the marginal posterior distribution of the lower support point of the density has a shifted exponential distribution in the limit, as in the parametric case with known density (Ibragimov and Has'minskij, 1981). In particular, we need a prior for a differentiable decreasing density with known lower support point with the following properties: a) the posterior distribution of the density concentrates at the minimax rate over local Hölder classes, up to a log factor, in L1 norm; b) the density is pointwise consistent uniformly in a neighbourhood of the lower support point, a posteriori. We also have a condition on the interaction term and a mild condition on the prior of the lower support point. The general conditions we obtain for the BvM type result do not coincide with those of Kleijn and Knapik (2013); the latter do not hold for the hierarchical mixture prior we consider.

To ensure that the density is a posteriori asymptotically consistent pointwise in a neighbourhood of the lower support point, as well as concentrating in L1 norm, we consider a non-homogeneous Completely Random Measure mixture, instead of the more commonly used Dirichlet mixture. We constructed an MCMC sampler for this prior; its performance is illustrated on simulated data, and the method is applied to modelling the distribution of bids in procurement auctions.
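As a rough illustration of the shifted exponential limit mentioned above, here is a minimal sketch for the simplest parametric case only, not the semiparametric setting of the talk. It assumes (these are assumptions, not taken from the abstract) the model X_i = theta + Exp(1) with a flat prior on theta, for which the posterior of `min(x) - theta` is exactly exponential with rate n. The function name is hypothetical.

```python
import numpy as np

def lower_support_posterior_quantiles(x, probs):
    """Posterior quantiles of theta in the toy model X_i = theta + Exp(1), flat prior.

    The likelihood is proportional to exp(n * theta) on theta <= min(x), so
    P(theta <= t | x) = exp(n * (t - min(x))) for t <= min(x), i.e.
    min(x) - theta | x follows an exponential distribution with rate n.
    """
    x = np.asarray(x, dtype=float)
    n = x.size
    x_min = x.min()
    probs = np.asarray(probs, dtype=float)
    # Invert the posterior CDF: t_p = min(x) + log(p) / n.
    return x_min + np.log(probs) / n

# Simulated data (assumed true lower support point 2.0).
rng = np.random.default_rng(0)
theta_true = 2.0
x = theta_true + rng.exponential(scale=1.0, size=200)

# One-sided 95% credible interval [lo, hi]; hi is the sample minimum.
lo, hi = lower_support_posterior_quantiles(x, [0.05, 1.0])
print(f"95% credible interval for the lower support point: [{lo:.4f}, {hi:.4f}]")
```

The interval has length of order 1/n rather than 1/sqrt(n), which is the nonregular feature that distinguishes this setting from the usual Gaussian Bernstein-von Mises limit.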

*13:55-15:50 Havard Rue (KAUST) Penalized Complexity Priors: 4 years later*

Four years ago at the OBayes meeting in Valencia, I presented our work on Penalized Complexity priors (PC-priors). In this talk, I will review the developments since then and also summarize our own experience using PC-priors.

*Stephen Walker (U. Texas Austin) Classes of Objective priors from minimizing information functions [discussant: Jean-Michel Marin (U. Montpellier)]*

*16:10-17:15 Sara Wade (U. Edinburgh) Bayesian cluster analysis [discussant: Clara Grazian (U. Oxford)]*

Abstract: Clustering is widely studied in statistics and machine learning, with applications in a variety of fields. As opposed to popular algorithms such as agglomerative hierarchical clustering or k-means which return a single clustering solution, Bayesian methods provide a posterior over the space of partitions, allowing one to assess statistical properties, such as uncertainty on the number of clusters. However, an important problem is how to summarize the posterior; the huge dimension of partition space and difficulties in visualizing it add to this problem. In this work, we develop appropriate point estimates and credible sets to summarize the posterior of the clustering structure based on decision and information theoretic techniques. Moreover, empirical results on simulation studies show that the point estimate successfully recovers the true number of clusters, with honest credible balls, contrary to recent results on Bayesian nonparametric models.
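To make the idea of a decision-theoretic point estimate concrete, the following is a minimal sketch that assumes the variation of information (VI) as the loss and restricts the search to the sampled partitions themselves. Both choices are assumptions made for illustration; the abstract does not specify the loss or the search strategy, and the function names are hypothetical.

```python
import numpy as np

def variation_of_information(a, b):
    """Variation of information between two partitions given as label vectors."""
    a = np.asarray(a)
    b = np.asarray(b)
    n = a.size
    # Relabel clusters as 0..K-1 so they can index a contingency table.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)
    counts = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(counts, (a_idx, b_idx), 1)
    joint = counts / n

    def entropy(p):
        p = p[p > 0]
        return -np.sum(p * np.log(p))

    # VI = H(A|B) + H(B|A) = 2 H(A,B) - H(A) - H(B).
    return 2 * entropy(joint.ravel()) - entropy(joint.sum(axis=1)) - entropy(joint.sum(axis=0))

def vi_point_estimate(samples):
    """Return the sampled partition with the smallest Monte Carlo expected VI."""
    samples = np.asarray(samples)
    m = samples.shape[0]
    expected = np.array([
        np.mean([variation_of_information(samples[i], samples[j]) for j in range(m)])
        for i in range(m)
    ])
    return samples[int(np.argmin(expected))], expected

# Toy "posterior samples" of partitions of six items (made up for illustration).
samples = np.array([
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],   # same partition as above, with labels swapped
    [0, 0, 1, 1, 1, 1],
    [0, 1, 2, 3, 4, 5],
])
estimate, expected_vi = vi_point_estimate(samples)
print(estimate, expected_vi.round(3))
```

Because the loss depends only on the partition and not on the labels, the label-switched third sample counts as identical to the first two.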

##### Sunday, 30 June

*9:00-10:05 Peter Mueller (U. Texas Austin) BNP for semi-competing risks [discussant: Isadora Antoniano-Villalobos (U. Venezia)]*

We develop a Bayesian nonparametric (BNP) approach to evaluate the effect of treatment in a randomized trial where a nonterminal event may be censored by a terminal event, but not vice versa (i.e., semi-competing risks). Based on the idea of principal stratification, we define a novel estimand for the causal effect of treatment on the non-terminal event. We introduce identification assumptions, indexed by a sensitivity parameter, and show how to draw inference using our BNP approach. We conduct a simulation study and illustrate our methodology using data from a brain cancer trial.

Paper: Yanxun Xu, Daniel Scharfstein, Peter Müller, Michael Daniels. A Bayesian Nonparametric Approach for Evaluating the Effect of Treatment in Randomized Trials with Semi-Competing Risks. https://arxiv.org/abs/1903.08509

*10:30-12:25 Gonzalo Garcia-Donato (U. Castilla-La Mancha) Variable selection priors for survival models with censored data*

Abstract: We consider the variable selection problem when the response is subject to censoring. A main particularity of this context is that the information content of sampled units varies depending on the censoring times. We approach the problem from an objective Bayesian perspective where the choice of prior distributions is a delicate issue given the well-known sensitivity of Bayes factors to these prior inputs. We show that borrowing priors from the 'uncensored' literature may lead to unsatisfactory results, as this default procedure implicitly assumes a uniform contribution of all units independently of their censoring times. In this paper, we develop specific methodology based on a generalization of the conventional priors (also called hyper-g priors) that explicitly addresses the particularities of survival problems, and we argue that it behaves comparatively better than standard approaches on the basis of arguments specific to variable selection problems (e.g. predictive matching). Although the main focus of this work is on foundations of model selection under censoring, for illustrative purposes we apply the methodology to a classic transplant dataset and to a recent large epidemiological study about breast cancer survival rates in Castellon, a province of Spain.

*Stéphanie van der Pas (Leiden U.) Uncertainty quantification for survival analysis [discussant: Leo Held (U. Zürich)]*

Abstract: Joint work with Ismael Castillo (Sorbonne University, Paris, France). The Bayesian framework offers an intuitive approach towards uncertainty quantification. We discuss our recent theoretical advances towards uncertainty quantification for several survival objects within the Bayesian paradigm. We prove posterior convergence for the hazard function in supremum norm at the minimax rate. As an intermediate step, we obtain Bernstein-von Mises theorems for linear functionals of the hazard. We then adopt a multiscale approach to prove Bernstein-von Mises theorems for the cumulative hazard and the survival function, leading to credible bands that can be considered confidence bands. Our approach is general and does not rely on conjugacy. We demonstrate the usefulness of the proof technique on several variants of the piecewise exponential model.

*13:55-15:50 Ryan Martin (North Carolina State U.) Non-standard posterior distributions and Erlis Ruli (U. Padova) Posterior Distributions with Implicit Priors [discussant: Dongchu Sun (U. Missouri)]*

Traditional objective Bayes focuses on choosing priors such that the usual posterior distribution -- likelihood times prior -- achieves certain properties, such as valid uncertainty quantification (i.e., posterior credible sets achieve the nominal coverage probability, at least approximately). But likelihood-times-prior is not the only possible construction; greater flexibility, simpler computation, and even improved performance are possible by considering alternative constructions of data-dependent measures. In this talk I will consider a few such constructions, namely, posteriors based on data-dependent priors, "model-less" Gibbs posteriors, and a new approach that leverages dependence on data ordering, focusing on their uncertainty quantification properties.

The parametric bias-correction framework (Firth, 1993; Kosmidis, Kenne Pagui and Sartori, 2019) provides useful methods for building mean-unbiased or median-unbiased maximum likelihood estimates for model parameters. Such estimates are typically obtained by solving score equations, penalised by suitable functions of the parameters.

From a Bayesian perspective, these estimates can be interpreted as maximum a posteriori estimates when the (typically improper) prior has the first derivative of its logarithm equal to the aforementioned penalisation functions. Thus, the parametric bias-correction framework, used within the Bayesian setting, supplies model-dependent priors which have the property of delivering unbiased Bayesian estimates. Unfortunately, apart from special cases, such priors are defined only implicitly, in the sense that only the derivative of the log-prior is known in closed form. Hence it is not straightforward to obtain the corresponding posterior distribution.

We introduce two methods for computing the posterior distribution when only the first derivative of the log-prior is available. The first method uses a variant of the Rao score test and the second is based on Taylor expansions of the log-posterior density. Pros and cons of both methods are discussed by means of various examples of practical interest. Finally, the proposed methods can be used more generally with implicitly defined matching priors or with posterior distributions based on smooth estimating equations.
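The interpretation of a penalised score equation as a maximum a posteriori equation can be illustrated on one of the special cases where the prior is known explicitly. The sketch below assumes a single binomial count with a logit parameter, where the Firth (1993) penalty is the derivative of the log Jeffreys prior. It only solves for the estimate and does not implement either of the two posterior computation methods of the talk; the function names are hypothetical.

```python
import math

def penalised_score(beta, y, n):
    """Binomial score in the logit parameter plus the Firth-type penalty.

    With p = 1 / (1 + exp(-beta)), the score is y - n * p and the penalty is
    d/dbeta [0.5 * log(n * p * (1 - p))] = 0.5 * (1 - 2 * p),
    i.e. the derivative of the log Jeffreys prior for beta.
    """
    p = 1.0 / (1.0 + math.exp(-beta))
    return (y - n * p) + 0.5 * (1.0 - 2.0 * p)

def solve_penalised_score(y, n, lo=-30.0, hi=30.0, tol=1e-12):
    """Find the root of the penalised score by bisection (it is decreasing in beta)."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if penalised_score(mid, y, n) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# With y = 0 successes out of n = 10 the maximum likelihood estimate of beta is
# infinite, while the penalised estimate is finite.
y, n = 0, 10
beta_hat = solve_penalised_score(y, n)
p_hat = 1.0 / (1.0 + math.exp(-beta_hat))
print(beta_hat, p_hat)  # p_hat agrees with the closed form (y + 0.5) / (n + 1)
```

Only the derivative of the log-prior enters the estimating equation, which is why the estimate is available even when the prior itself is not known in closed form.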

*16:15-17:20 Peter Orbanz (Columbia U.) Convergence results (asymptotic normality etc) can be deduced from exchangeability properties [discussant: Daniel Roy (U. Toronto)]*

##### Monday, 1 July

*9:00-10:05 Nancy Reid (U. Toronto) Interface between Bayes and frequentism [discussant: Peter Grünwald (CWI Amsterdam)]*

*10:30-12:30 Gianluca Baio (University College London) Value of information and Peter Hoff (Duke U.) Bayes-optimal frequentist intervals and tests [discussant: Elias Moreno (U. Granada)]*

Bayesian statistical criteria evaluate the performance of statistical procedures on average across parameter values, whereas frequentist criteria evaluate performance on average across datasets. Combining these two criteria motivates the development of procedures that have frequentist statistical guarantees, but are Bayes-optimal in some sense.

We refer to such procedures as "frequentist, assisted by Bayes", or FAB.

In this talk I introduce FAB p-values and confidence intervals for some standard problems, and discuss adaptive FAB procedures for multiparameter inference.

*13:30-15:50 Pierre Latouche (U. Paris Descartes) Dimension selection on graphs and Jaeyong Lee (Seoul National U.) Post-Processed Posteriors for Band Structured Covariances [discussant: Guido Consonni (U. Cattolica Milano)]*

Sparse versions of principal component analysis (PCA) have established themselves as simple, yet powerful ways of selecting relevant features of high-dimensional data in an unsupervised manner. However, when several sparse principal components are computed, the interpretation of the selected variables may be difficult since each axis has its own sparsity pattern and has to be interpreted separately. To overcome this drawback, we propose a Bayesian procedure that makes it possible to obtain several sparse components with the same sparsity pattern. This allows the practitioner to identify which original variables are most relevant to describe the data. To this end, using Roweis’ probabilistic interpretation of PCA and an isotropic Gaussian prior on the loading matrix, we provide the first exact computation of the marginal likelihood of a Bayesian PCA model. Moreover, in order to avoid the drawbacks of discrete model selection, a simple relaxation of this framework is presented. It makes it possible to find a path of candidate models using a variational expectation-maximization algorithm. The exact marginal likelihood can eventually be maximized over this path, relying on Occam’s razor to select the relevant variables. Since the sparsity pattern is common to all components, we call this approach globally sparse probabilistic PCA (GSPPCA). Its usefulness will be illustrated on synthetic data sets and on several real unsupervised feature selection problems coming from signal processing and genomics.

*16:15-17:20 Lucia Paci (U. Cattolica Milano) Structural learning of contemporaneous dependencies in graphical VAR models [discussant: Edward George (U. Pennsylvania)]*

Abstract: We propose an objective Bayes approach based on graphical models for learning contemporaneous dependencies among multiple time series within the framework of Vector Autoregressive (VAR) models. We show that, if the covariance matrix is Markov with respect to a decomposable graph which is fixed over time, then the likelihood of a graphical VAR can be factorized as an ordinary decomposable graphical model. Additionally, using a fractional Bayes factor approach, we obtain the marginal likelihood in closed form and construct an MCMC algorithm for Bayesian graphical model determination with limited computational burden. We apply our method to study the interactions between multiple air pollutants over the city of Milan (Italy).

##### Tuesday, 2 July

*9:00-10:05 Rui Paulo (U. Lisboa) Variable selection in the presence of factors: a model selection perspective [discussant: Anabel Forte (U. Valencia)]*

Abstract: Factors are categorical variables, and the values which these variables assume are called levels. In this paper, we consider the variable selection problem where the set of potential predictors contains both factors and numerical variables. There are two possible approaches to variable selection: the estimation-based and the model-selection-based. In the former, the model containing all the potential predictors is estimated and a criterion for excluding variables is devised based on the estimate of the associated parameters. Here, sparsity must be encouraged either via the prior on the parameter vector or some other form of penalization. In the latter, all $2^p$ models are considered, and variable selection is based on the posterior distribution on the model space.

In this paper, we approach the variable selection problem in the presence of factors via the model selection perspective. Formally, this is a particular case of the standard variable selection setting where factors are coded using dummy variables. As such, the Bayesian solution would be straightforward and, possibly because of this, the problem, despite its importance, has not received much attention in the literature. Nevertheless, we show that this perception is illusory and that in fact several inputs like the assignment of prior probabilities over the model space or the parameterization adopted for factors may have a large (and difficult to anticipate) impact on the results. We provide a solution to these issues that extends the proposals in the standard variable selection problem and does not depend on how the factors are coded using dummy variables. Our approach is illustrated with a real example concerning a childhood obesity study in Spain.

*10:05-11:10 Michael Evans (U. Toronto) The measurement of statistical evidence as the basis for statistical reasoning [discussant: Martine Barons (U. Warwick)]*

There are various approaches to the problem of how one is supposed to conduct a statistical analysis.

Different analyses can lead to contradictory conclusions in some problems so this is not a satisfactory state of affairs. It seems that all approaches make reference to the evidence in the data concerning questions of interest as a justification for the methodology employed.

It is fair to say, however, that none of the most commonly used methodologies is absolutely explicit about how statistical evidence is to be characterized and measured. We will discuss the general problem of statistical reasoning and the development of a theory for this that is based on being precise about statistical evidence. This will be shown to lead to the resolution of a number of problems.

*11:35-12:40 Dimitris Fouskakis (National Technical U. Athens) All about PEP [discussant: Mark Steel (U. Warwick)]*

One of the main approaches used to construct prior distributions for objective Bayes model selection is the concept of random imaginary observations. The power-expected-posterior (PEP) prior was recently introduced in order to alleviate the amount of information introduced by the size of the training dataset. In this talk we initially present an overview of the methods used for constructing objective priors for model selection. We then present the PEP methodology for the variable selection problem under normal linear models. The theoretical properties of the prior are discussed, with a focus on the consistency of the Bayes factor, under different power parameters, when the dimension of the full model can also increase. We show that the PEP prior can be represented as a mixture of g-priors, like a wide range of prior distributions under normal linear models, and thus posterior distributions and Bayes factors can be derived in closed form, thereby retaining computational tractability. Comparisons with other mixtures of g-priors are made and emphasis is placed on Bayesian model averaging estimation. Additionally, an appealing idea based on sufficiency is presented, aiming to further reduce computational cost. We also discuss shrinkage type PEP priors in the case that the number of parameters is larger than the sample size. We then move to the broader framework of generalized linear models, where different versions of the PEP prior are introduced and compared according to their properties and behavior in simulated examples. Finally, we present a recent application of the PEP methodology in the model selection problem of undirected decomposable Gaussian graphical models.
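To indicate what a closed-form Bayes factor under a g-prior looks like, here is a minimal sketch of the standard expression for Zellner's g-prior with a fixed g, comparing a normal linear model with p predictors against the intercept-only model. This is not the PEP prior itself: mixtures of g-priors, the PEP prior among them according to the abstract, would additionally integrate over g. The function name is hypothetical.

```python
import math

def log_bf_g_prior(r2, n, p, g):
    """Log Bayes factor of a model with p predictors versus the intercept-only model.

    Assumes Zellner's g-prior with fixed g on the regression coefficients and the
    usual noninformative prior on the intercept and error variance:
        BF = (1 + g)^((n - 1 - p) / 2) * (1 + g * (1 - R^2))^(-(n - 1) / 2),
    where R^2 is the ordinary coefficient of determination of the larger model.
    """
    return 0.5 * (n - 1 - p) * math.log1p(g) - 0.5 * (n - 1) * math.log1p(g * (1.0 - r2))

# Illustrative values (made up): n = 50 observations, p = 3 predictors, g = n.
for r2 in (0.0, 0.2, 0.8):
    print(r2, log_bf_g_prior(r2, n=50, p=3, g=50.0))
```

When R^2 is 0 the expression reduces to (1 + g)^(-p/2), so the larger model is penalised purely for its dimension; the log Bayes factor then increases with R^2.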