hide
Free keywords:
-
Abstract:
Most existing sparse Gaussian process (g.p.)
models seek computational advantages by
basing their computations on a set of m basis
functions that are the covariance function of
the g.p. with one of its two inputs fixed. We
generalise this for the case of Gaussian covariance
function, by basing our computations on
m Gaussian basis functions with arbitrary diagonal
covariance matrices (or length scales).
For a fixed number of basis functions and
any given criteria, this additional flexibility
permits approximations no worse and typically
better than was previously possible.
We perform gradient based optimisation of
the marginal likelihood, which costs O(m2n)
time where n is the number of data points,
and compare the method to various other
sparse g.p. methods. Although we focus on
g.p. regression, the central idea is applicable
to all kernel based algorithms, and we also
provide some results for the support vector
machine (s.v.m.) and kernel ridge regression
(k.r.r.). Our approach outperforms the other
methods, particularly for the case of very few
basis functions, i.e. a very high sparsity ratio.