Kernel Interpolation for Scalable Online Gaussian Processes

This repository contains a gpytorch implementation of WISKI (Woodbury Inversion with SKI) from the paper

by Samuel Stanton, Wesley J. Maddox, Ian Delbridge, Andrew Gordon Wilson

🥃

Introduction

While Gaussian processes are the gold standard for calibration and predictive performance in many settings, they scale at least $\mathcal{O}(n),$ where $n$ is the number of data points. We show how to use structured kernel interpolation (SKI) to efficiently reuse computations to produce constant time (in $n$) updates to the posterior distribution, while retaining the exact inference formulation (no variational objectives) of Gaussian processes.

Installation

To replicate our experiments, you'll need to simply install the package:

git clone https://github.com/wjmaddox/online_gp.git
cd online_gp
pip install -r requirements.txt
pip install -e .

Exploration of Different Types of Online Approximations

We've included an exploration and tutorial of different types of online approximate Gaussian processes (WISKI, Online SVGPs, and Online SGPR) in this notebook. We'd highly encourage the reader to start there to understand the differences between types of data observed in the streaming setting (whether iid data or time series formatted data).

Streaming Regression and Classification Experiments

The UCI regression and classification experiments require an additional data storage package for logging:

git clone https://github.com/samuelstanton/upcycle.git
pip install -e upcycle/

These experiments use Hydra to manage configuration. Every field in the config/*.yaml files can be overridden from the command line.

Regression

python experiments/regression.py

Important options

model=(exact_gp_regression, svgp_regression, sgpr_regression, wiski_gp_regression)
dataset=(skillcraft, powerplant, elevators, protein, 3droad)
stem=(eye, linear, mlp)

Classification

python experiments/classification.py

Important options

model=(exact_gpd, svgp_bin, wiski_gpd)
dataset=(banana, svm_guide_1)
stem=(eye, linear, mlp)

Logging

By default your experimental results will be saved as csv files in data/experiments/<exp_name>. If you have an Amazon AWS command line interface (CLI) configured you can modify config/logger/s3.yaml and use the option logger=s3 to log results in a specified S3 bucket.

Bayesian Optimization and Active Learning

Our bayesian optimization and active learning experiments are built off of Botorch and use standard bayesian optimization loops as in their tutorials.

Bayesion Optimization

cd experiments/bayesopt/
python bayesopt.py --model=wiski --cuda --cholesky_size=1001 \
    --dim=3 --acqf=ucb --function=Ackley \
    --noise=4.0 --num_steps=1500 --batch_size=3 --seed=0 \
    --output=results.pt

Active Learning

The malaria dataset can be downloaded from here. Use the --data_loc to load the file in from where you downloaded it.

cd experiments/active_learning/

#### wiski and exact experiments with qnIPV
python qnIPV_experiment.py --cuda --batch_size=6 --num_steps=500 --model=exact
python qnIPV_experiment.py --cuda --batch_size=6 --num_steps=500 --model=wiski

##### osvgp experiments with osvgp's minimum posterior variance
# random
python mpv_osvgp.py --cuda --batch_size=6 --num_steps=500 --seed=0 --acqf=random --lr_init=1e-4 --output=svgp_random.pt

# maximum posterior variance
python mpv_osvgp.py --cuda --batch_size=6 --num_steps=500 --seed=0 --acqf=max_post_var --lr_init=1e-4 --output=svgp_mpv.pt

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
config		config
data/experiments		data/experiments
experiments		experiments
notebooks		notebooks
online_gp		online_gp
scripts		scripts
tests		tests
.DS_Store		.DS_Store
.gitattributes		.gitattributes
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
requirements.txt		requirements.txt
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Kernel Interpolation for Scalable Online Gaussian Processes

Introduction

Installation

Exploration of Different Types of Online Approximations

Streaming Regression and Classification Experiments

Regression

Classification

Logging

Bayesian Optimization and Active Learning

Bayesion Optimization

Active Learning

About

Releases

Packages

Languages

License

wjmaddox/online_gp

Folders and files

Latest commit

History

Repository files navigation

Kernel Interpolation for Scalable Online Gaussian Processes

Introduction

Installation

Exploration of Different Types of Online Approximations

Streaming Regression and Classification Experiments

Regression

Classification

Logging

Bayesian Optimization and Active Learning

Bayesion Optimization

Active Learning

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages