tmpethick/ensembled-dngo

Ensembled Deep Network for Global Optimization

This repository contains the code and results for the Ensembled Deep Network for Global Optimization, provided for reproducibility. For a proper introduction to the model and results, please see the accompanying write-up.

Note that producing the results took several days across roughly 50 CPUs. For convenience, the regret history and the associated samples are included so that the behavior of the models can be explored through a Jupyter notebook.

Most of the library is self-contained (apart from the usual dependencies such as NumPy and PyTorch): everything from the Bayesian linear regressor to the Bayesian optimization procedure is implemented from scratch. The only two computationally heavy subroutines that have been outsourced to external libraries are the MCMC implementation and the Cholesky decomposition.
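To give a feel for what a Bayesian linear regressor built on a Cholesky decomposition looks like, here is a minimal sketch. It is not the code in src/: the function names, the fixed precisions alpha and beta, and the toy data are all assumptions made for illustration. In a DNGO-style model the feature matrix Phi would come from the last hidden layer of a neural network; here it is just a bias column plus the raw input.

```python
import numpy as np


def chol_solve(L, b):
    """Solve (L @ L.T) x = b given the lower Cholesky factor L."""
    # A dedicated triangular solver would be faster; np.linalg.solve keeps
    # this sketch dependent on NumPy only.
    return np.linalg.solve(L.T, np.linalg.solve(L, b))


def fit_blr(Phi, y, alpha=1.0, beta=25.0):
    """Posterior over weights for y = Phi @ w + noise.

    alpha: prior precision on the weights (assumed value).
    beta: noise precision (assumed value).
    """
    d = Phi.shape[1]
    # Posterior precision matrix A = alpha * I + beta * Phi^T Phi.
    A = alpha * np.eye(d) + beta * Phi.T @ Phi
    L = np.linalg.cholesky(A)
    # The posterior mean solves A m = beta * Phi^T y.
    mean = chol_solve(L, beta * Phi.T @ y)
    return mean, L


def predict_blr(Phi_new, mean, L, beta=25.0):
    """Predictive mean and variance for each row of Phi_new."""
    mu = Phi_new @ mean
    # var_i = 1 / beta + phi_i^T A^{-1} phi_i
    solved = chol_solve(L, Phi_new.T)
    var = 1.0 / beta + np.sum(Phi_new.T * solved, axis=0)
    return mu, var


# Toy data (hypothetical): a noisy line with intercept 0.5 and slope 2.0.
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(50, 1))
Phi = np.hstack([np.ones_like(X), X])
y = 0.5 + 2.0 * X[:, 0] + rng.normal(scale=0.2, size=50)

mean, L = fit_blr(Phi, y)
mu, var = predict_blr(Phi[:3], mean, L)
```

The predictive variance is what an acquisition function would consume during Bayesian optimization; factoring the precision matrix once and reusing it for both the mean and the variance is the main reason to go through Cholesky rather than an explicit inverse.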

The repository consists of:

  • a Jupyter Notebook: to recreate paper plots, explore results from experiments, and run new experiments interactively.
  • HPC: a workflow for running experiments in parallel on an HPC cluster that uses the IBM LSF batch system.
  • Paper: which describes the model and results.
  • RoBO and Spearmint: a docker instance and script, respectively, to run benchmark models.

Code Outline

The code (i.e. everything in src/) is roughly divided into four parts:

  • BO
  • Models
    • Linear Bayesian Regression
    • Deep Neural Network
    • DNGO
    • Ensemble DNGO model
  • Benchmark
    • Embedding
    • Hyperparameter optimization of Logistic Regression
  • Priors

Explore Results (Jupyter Notebook)

Note: First consult the installation steps before attempting to use the notebook.

The notebook found at ./notebook.ipynb serves three purposes:

  • Recreating the plots from the write-up.
  • Exploring the acquisition landscape and regret plot for a given experiment from the write-up.
  • Running a new model programmatically.
    (Every configuration is done through an interface shared with run.py to ensure reproducibility and to allow a confidence interval to be calculated from the aggregated results of identical model configurations.)
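As an illustration of the aggregation mentioned in the last point, the following sketch computes a mean regret curve with an approximate 95% interval from repeated runs of one configuration. The function name, the normal approximation, and the example numbers are assumptions; the notebook may aggregate differently.

```python
import numpy as np


def aggregate_regret(histories, z=1.96):
    """Mean regret per iteration with a normal-approximation interval.

    histories: array-like of shape (n_runs, n_iter), one row per repeated run.
    z: 1.96 gives an approximate 95% interval (assumed choice).
    """
    h = np.asarray(histories, dtype=float)
    mean = h.mean(axis=0)
    # Standard error of the mean across runs (sample std, ddof=1).
    sem = h.std(axis=0, ddof=1) / np.sqrt(h.shape[0])
    return mean, mean - z * sem, mean + z * sem


# Hypothetical regret histories from three runs of the same configuration.
runs = [[1.0, 0.5, 0.2], [0.8, 0.6, 0.1], [1.2, 0.4, 0.3]]
mean, lower, upper = aggregate_regret(runs)
```

With only a handful of runs, a Student-t quantile in place of z would give a more conservative interval.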

Installation

  • Conda requirement for server:

    wget https://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh -O ~/miniconda.sh
    bash ~/miniconda.sh
  • General requirements:

    conda create -n eth python=3.6
    source $HOME/miniconda/bin/activate
    source activate eth
    conda install -y pytorch-cpu torchvision-cpu -c pytorch # gpu: conda install pytorch torchvision -c pytorch
    conda install -y -c conda-forge blas seaborn scipy matplotlib pandas gpy pathos emcee
    pip install pydot-ng
    git clone https://github.com/automl/HPOlib2.git
    cd HPOlib2
    for i in `cat requirements.txt`; do pip install $i; done
    for i in `cat optional-requirements.txt`; do pip install $i; done
    python setup.py install
    cd ..
  • Notebook requirements:

    conda install -c conda-forge ipympl
    conda install jupyterlab nodejs
    jupyter labextension install @jupyter-widgets/jupyterlab-manager
  • Plotly requirements for jupyterlab:

    jupyter labextension install @jupyterlab/plotly-extension

Environment

The project uses .autoenv.zsh to activate the correct python virtual environment. By default it uses the name eth which can be modified by changing .autoenv.zsh.

Executing Experiment

Please consult ./run.py to see the available commands.

Every model configuration is done through a declarative interface to ensure reproducibility and to allow a confidence interval to be calculated from the aggregated results of identical model configurations.
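One way such a declarative interface could work is sketched below: command-line flags are parsed into a plain dictionary, and a stable key derived from that dictionary lets runs with identical settings be grouped later. Only the --model and --n_iter flags appear in this README; the defaults, the helper names, and the hashing scheme are hypothetical and are not taken from run.py.

```python
import argparse
import hashlib
import json


def parse_config(argv):
    """Turn command-line flags into a plain configuration dictionary."""
    parser = argparse.ArgumentParser(description="Hypothetical experiment config")
    parser.add_argument("--model", default="gp")            # assumed default
    parser.add_argument("--n_iter", type=int, default=10)   # assumed default
    return vars(parser.parse_args(argv))


def config_key(config):
    """Stable identifier shared by runs with identical settings."""
    # sort_keys makes the key independent of dictionary ordering.
    payload = json.dumps(config, sort_keys=True).encode("utf-8")
    return hashlib.sha1(payload).hexdigest()[:10]


config = parse_config(["--model", "gp", "--n_iter", "2"])
key = config_key(config)
```

Because the key depends only on the settings, repeated runs of the same configuration map to the same group, which is what makes aggregating them into a confidence interval straightforward.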

If it is required to run the experiment programmatically instead, please see the notebook. This also illustrates how to recreate the models for already run experiments.

Executing on Server

The setup is specifically tailored to the Euler and Leonhard clusters at ETH, which use the IBM LSF batch system. To set up the environment, follow this great guide by Tom Stesco.

Whereas a model can be run locally by sending parameters to run.py, running it on the server requires running it through a make script that ensures proper synchronization. It automates three steps: 1) pushing code and training data to the server 2) running run.py remotely and 3) pulling the results back down so they can be explored in the interactive notebook.

  • make pushdata: Pushes ./raw and ./processed, which contain the training data for the hyperparameter optimization benchmarks, to the server. This is required since the servers block network access by default. To generate these folders locally first, run one of the models locally by executing python run.py.
  • make push: Used to synchronize code changes.
  • make ARGS="--model gp --n_iter 2" run: Runs the experiment on the server; in this particular case it executes BO for two steps using a GP model.
  • make pull: To pull generated plots and merge the remote CSV database with the local version.

These are all made for Euler. For Leonhard use the namespaced version: make pushdata-leonhard, make push-leonhard, make run-leonhard and make pull-leonhard.
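The push, run, pull sequence and the cluster-specific target names can be summarized in a small helper. This is only a sketch that builds the argument lists; the function name and its parameters are hypothetical, and nothing is executed unless the lists are passed to something like subprocess.run.

```python
def make_commands(cluster="euler", run_args="--model gp --n_iter 2", include_data=False):
    """Build the make invocations for one experiment, in order."""
    if cluster not in ("euler", "leonhard"):
        raise ValueError(f"unknown cluster: {cluster}")
    # Euler uses the plain targets; Leonhard uses the namespaced ones.
    suffix = "" if cluster == "euler" else "-leonhard"
    commands = []
    if include_data:
        commands.append(["make", f"pushdata{suffix}"])
    commands.append(["make", f"push{suffix}"])
    # ARGS is forwarded to run.py on the server.
    commands.append(["make", f"ARGS={run_args}", f"run{suffix}"])
    commands.append(["make", f"pull{suffix}"])
    return commands


euler_cmds = make_commands()
leonhard_cmds = make_commands("leonhard", include_data=True)
```

Passing each command as a list avoids shell quoting problems with the spaces inside ARGS.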

Currently run experiments

The experiments run so far are listed in tests_euler.txt and tests_leonhard.txt for reproducibility and to provide inspiration for possible model configurations. The experiments are split into two files so that models benefiting from GPU acceleration can be run on the GPU-enabled Leonhard cluster.

Execute Tests

Run with make test. Note that the end-to-end test currently requires you to inspect the plots and close them one at a time.

Comparisons

Spearmint and RoBO are used as benchmark models for comparison.

RoBO

To run RoBO in a Linux environment, a Docker instance has been created in ./RoBO with the following Makefile commands:

  • make build Build docker image.
  • make run Run container based on image that starts jupyter lab on port 8888. (mounts /RoBO/shared)
  • make stop Stop container.
  • make start Start stopped container.
  • make clear Delete both container and image.

Spearmint

What follows is a summary of how to use Spearmint (which requires that mongodb is installed). We assume that ./spearmint is your working directory.

Install (will download to your current directory):

git clone https://github.com/HIPS/Spearmint
pip install -e ./Spearmint

Run spearmint:

mongod --fork --logpath ./log/mongodb.log --dbpath /usr/local/var/mongodb
python ./Spearmint/spearmint/main.py .

Quit the mongodb daemon:

  1. Find the pid with top | grep mongo
  2. Kill the process with kill <pid>.

To clear mongodb:

mongo
use spearmint
db['<experiment_name>.jobs'].remove({status:'pending'})

Plot:

python spearmint_plots.py .
