Gaussian Process Classification

The second group project I worked on at COMPASS mainly involved learning how Gaussian process classification works, as it is a complicated procedure and not as straightforward as Gaussian process regression.

Our work involved a number of techniques that have improved Gaussian process classification in the recent literature:

  • Pseudo-Marginal Likelihood: An importance sampling procedure that replaces the intractable marginal likelihood with an unbiased estimate inside MCMC sampling (sketched after this list).
  • Subset Selection: An entropy-based measure that chooses a subset of the full dataset that retains as much information as possible, known as the Informative Vector Machine (IVM) (also sketched below).
  • Laplace Approximation: A Gaussian approximation to the posterior of the latent variables, centred at its mode (see the final sketch below).
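To make the pseudo-marginal idea concrete, here is a minimal sketch in R of an importance sampling estimate of the marginal likelihood p(y | θ). It draws latent function values from a Gaussian proposal (for instance, the mean f_hat and covariance Sigma_hat produced by a Laplace approximation) and averages the importance weights. Everything here (the function names, arguments, the mvtnorm dependency, labels y in {-1, +1}) is illustrative, not our package's actual code.

    # Pseudo-marginal estimate of log p(y | theta) by importance sampling.
    # Proposal: a Gaussian q(f) = N(f_hat, Sigma_hat), e.g. from a Laplace
    # approximation. K is the kernel (prior covariance) matrix under theta.
    library(mvtnorm)

    log_lik <- function(f, y) sum(-log1p(exp(-y * f)))  # logistic, y in {-1, +1}

    pseudo_marginal <- function(y, K, f_hat, Sigma_hat, n_imp = 100) {
      f_samp <- rmvnorm(n_imp, mean = f_hat, sigma = Sigma_hat)
      log_w <- apply(f_samp, 1, function(f) {
        log_lik(f, y) +                                  # log p(y | f)
          dmvnorm(f, sigma = K, log = TRUE) -            # log GP prior p(f | theta)
          dmvnorm(f, mean = f_hat, sigma = Sigma_hat, log = TRUE)  # log q(f)
      })
      # Numerically stable log of the average weight; the average itself is
      # an unbiased estimate of p(y | theta), which is exactly what a
      # pseudo-marginal MCMC sampler requires.
      max(log_w) + log(mean(exp(log_w - max(log_w))))
    }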
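The entropy criterion behind the IVM can also be sketched. Greedy selection adds, at each step, the point whose inclusion most reduces the differential entropy of the posterior; for a Gaussian, that is the point with the largest current posterior variance. The sketch below assumes a precomputed kernel matrix K and stands in a single Gaussian noise variance sigma2 for the IVM's likelihood site updates, so it illustrates only the selection rule, not the full algorithm.

    # Greedy entropy-based subset selection in the spirit of the IVM.
    # K: n x n kernel matrix; m: subset size; sigma2: assumed noise variance.
    ivm_subset <- function(K, m, sigma2 = 1) {
      n <- nrow(K)
      var_post <- diag(K)                  # current posterior variances
      M <- matrix(0, nrow = 0, ncol = n)   # accumulated rank-1 updates
      active <- integer(0)
      for (j in seq_len(m)) {
        score <- 0.5 * log1p(var_post / sigma2)  # entropy reduction per point
        score[active] <- -Inf                    # never reselect a point
        i <- which.max(score)
        # Posterior covariance between point i and every point, given the
        # points selected so far (prior row minus accumulated updates).
        k_i <- K[i, ] - colSums(M * M[, i])
        m_new <- k_i / sqrt(k_i[i] + sigma2)
        var_post <- var_post - m_new^2           # rank-1 variance downdate
        M <- rbind(M, m_new)
        active <- c(active, i)
      }
      active
    }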
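Finally, the Laplace approximation itself replaces the non-Gaussian posterior over the latent values with a Gaussian centred at the posterior mode, which Newton's method finds. This is a minimal sketch of the standard iteration for a logistic likelihood (as in Rasmussen and Williams' textbook), assuming labels y in {-1, +1} and a precomputed kernel matrix K; again it is illustrative rather than our package's implementation.

    # Laplace approximation for GP binary classification with a logistic
    # likelihood: Newton iteration for the posterior mode f_hat, then a
    # Gaussian approximation N(f_hat, Sigma_hat) to the posterior.
    laplace_gp <- function(K, y, n_iter = 25) {
      n <- length(y)
      f <- rep(0, n)                       # start Newton at the prior mean
      for (it in seq_len(n_iter)) {
        pi_f <- 1 / (1 + exp(-f))          # sigmoid of current latent values
        grad <- (y + 1) / 2 - pi_f         # gradient of log p(y | f)
        W <- pi_f * (1 - pi_f)             # negative Hessian of log p(y | f)
        sW <- sqrt(W)
        B <- diag(n) + outer(sW, sW) * K   # B = I + W^(1/2) K W^(1/2)
        b <- W * f + grad
        a <- b - sW * solve(B, sW * (K %*% b))
        f <- as.vector(K %*% a)            # Newton update for the mode
      }
      Sigma_hat <- K - t(sW * K) %*% solve(B, sW * K)  # (K^-1 + W)^-1
      list(f_hat = f, Sigma_hat = Sigma_hat)
    }

In a pipeline like the one sketched here, these pieces chain together: laplace_gp supplies the proposal (f_hat, Sigma_hat) that pseudo_marginal turns into an unbiased marginal likelihood estimate inside the MCMC sampler.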

Together, these approximations make Gaussian process classification feasible. Without approximations such as these, the runtime of the procedure would be prohibitive.

Finally, we compared the results on an e-mail spam dataset, where our method achieved higher prediction accuracy than a JAGS implementation of logistic regression. We combined our code, written in Rcpp, into an R package, available here.

Daniel Williams
CDT Student

I am a PhD student at the University of Bristol under the COMPASS CDT, and previously studied at the University of Exeter. My research currently concerns truncated density estimation and unnormalised models, but I am also interested in AI more generally, including all kinds of learning: Machine, Deep and Reinforcement (as well as some others!).