Applied analysis of Bayesian inverse problems

During the last few decades the Bayesian methodology has attracted a lot of attention in the inverse problems community (see the article by Stuart or the book by Kaipio and Somersalo for an introduction). The goal of this approach is to infer an unknown parameter from a set of noisy indirect measurements. We start by identifying a forward map that models the measurements as well as a prior probability measure that models our prior knowledge of the parameter. We then combine the forward map and the prior measure to obtain a posterior probability measure that is regarded as the solution to the inverse problem.

A particularly challenging setting for Bayesian inverse problems is when the parameter belongs to an infinite dimensional Banach space. This is the case in inverse problems where the forward map involves the solution of a partial differential equation (PDE). A key question in infinite dimensional Bayesian inverse problems is that of well-posedness: Is the solution to the inverse problem well-defined and does it depend continuously on the data?

I am interested in the issue of well-posedness when the prior measure is not Gaussian and has heavy-tails. I have studied the cases of non-Gaussian priors with exponential tails and infinitely divisible priors. For more information see the following articles: Link 1 and Link 2.

MCMC in infinite dimensions

The main challenge in practical applications of the Bayesian methodology for parameter estimation is the extraction of information from the posterior probability measure. The main workhorse of the Bayesian framework in this context is the Markov Chain Monte Carlo (MCMC) method. I am interested in infinite-dimensional problems which, in practice are approximated by a high-dimensional problem that results from discretization. Furthermore, in some cases the priors that are used in the inverse problem are highly non-Gaussian (such as the compressible priors above). Due to these attributes, most conventional MCMC algorithms become highly inefficient in sampling from posteriors that arise from non-Gaussian priors.

Recently, Cotter et al. introduced several classes of MCMC algorithms that are reversible in infinite dimensions but rely on the assumption that the prior measure is a Gaussian. I am interested in the design of algorithms that are well-defined on infinite dimensional spaces and can by applied to non-Gaussian priors.  A particularly interesting case of such prior measures is the class of self-decomposable priors that includes the Besov prior of Lassas et al.  In the manuscript [link to arXiv] we introduce two classes of Metropolis-Hastings algorithms using autoregressive type proposals that have the above mentioned properties and are tailor made to non-Gaussian priors.

Sampling the Laplace prior using a variation of the random walk for self-decomposable priors.
Trace plot of the Laplace sampler.

Compressible priors

Estimation of sparse parameters is a central problem in different areas such as compressive sensing, inverse problems and statistics and has wide applications in image compression, medical and astronomical imaging and machine learning. I am interested in the case where the compressible parameter of interest belongs to an infinite dimensional Banach or Hilbert space. My goal is to develop a framework for estimation of compressible parameters as well as the uncertainties that are associated with the estimated values.

I am focusing on the choice of appropriate prior measures that can model compressible parameters. In the manuscript “Well-posed Bayesian inverse problems with infinitely divisible and heavy-tailed priors measures”. I studied the theoretical aspects of Bayesian inverse with heavy-tailed priors and laid the groundwork for the analysis of Bayesian inverse problems with compressible priors. I also introduced an instance of a class prior measures to model compressibility.

A non-convex prior resulting in a compressible posterior.
A sample from a prior with compressible realizations in Fourier domain.
Samples from a compound Poisson process prior with countably many jumps for edge preserving recovery of signals.

Semi-supervised learning in the Continuum limit

Graphical Semi-supervised learning (SSL) is the problem of inferring the labels of the vertices of a graph from the labels of a small number of vertices along with additional information regarding the similarities of the points. The SSL problem is often cast in the graphical setting where a weighted graph.

My research in this field is concerned with understanding the properties of the probit method for SSL in the large data limit where the number of vertices goes to infinity. Dunlop et al.  showed that in this case the probit SSL problem converges to an optimization problem in the continuum limit. We analyze this continuum limit and characterise its solution. In particular we show that under general conditions the probit classifier is dominated by the first few eigenfunctions of the graph Laplacian which converges to an elliptic differential operator in the continuum limit. We also show that under certain conditions as the measurement noise goes to zero probit SSL will successfully recover the correct missing labels.

Continuum limit of spectral clustering

Clustering is a closely related problem to SSL where the aim is to find meaningful coarse structure in a dataset (see the review paper by Schaefer). A particularly successful approach for this task is spectral clustering. Here few eigenvalues of the graph Laplacian operator are used to construct a mapping that embeds the data in a low dimensional space. Then, an algorithm such as k-NN is used to cluster the embedded data more efficiently.

In a similar manner to probit SSL, spectral clustering also has a continuum analog as the size of the dataset goes to infinity. In the upcoming publication we study this continuum limit. We show that if the data points are randomly distributed according to a measure which is a mixture of “well-separated components” then spectral clustering is successful and identifies the correct coarse structure in the dataset with high probability.
Generic proximity graph on a dataset consisting of two clusters.
Second eigenvector of the graph Laplacian (Fiedler vector) of a dense graph. Value of this eigenvector naturally splits the data into two clusters.

Regularization of singular source terms

I am interested in consistent regularizations of singular source terms for the numerical solution of PDEs. This project is motivated by the need to regularize singular source terms in a range of applications such as the immersed boundary and level set methods. Previously, I studied the smooth regularizations of the delta Distribution (link to article). I introduced a general framework for construction of such regularizations and studied their convergence to the Dirac delta distribution in different topologies. Currently I am working on extending those results to the case of sources that are supported on a manifold such as line sources.

Some regularizations of the delta distribution in one and two dimensions.
Solution of the wave equation using tensor product and radial regularizations.
Solution of the KDV equation using a regularized point source initial condition with different support sizes.