An important problem in statistics is to recover
a function or signal from massive, noisy data. In contrast to the
orthonormal bases traditionally used in function estimation,
overcomplete (or redundant) representations have been advocated
due to their flexibility and adaptivity. Bayesian methods provide
several advantages in learning an overcomplete representation:
regularization is specified through priors; inferences on
hyperparameters are easily obtained via Markov chain Monte Carlo;
and probabilistic outputs give a full range of summaries for
prediction or estimation.
In this talk, we focus on learning the overcomplete representation
of a reproducing kernel Hilbert space (RKHS). We
present a new, fully Bayesian framework and theory that
coherently embeds kernel regression/classification in a general
nonparametric model. The theory behind our approach relates the
model to statistical learning methods, showing that the new class of
priors supports the full range of functions in the RKHS. Key practical
features of our approach
include the use of shrinkage priors to address problems of ``large
p'', coherent updating as sample sizes change, and an
understanding of so-called ``unlabelled'' data. Several examples
illustrate the practical benefits of our new Bayesian kernel
framework.