CATE Benchmark

A testing platform to assess the performance of CATE estimators across popular datasets.

Installation

The easiest way to replicate the running environment is through Anaconda. Once Anaconda is installed, follow the steps below.

  1. Download the repo.
  2. Enter the directory (i.e. cd cate-benchmark).
  3. Run the following command to recreate the 'cate-bench' conda environment:

conda env create -f environment.yml

  4. Download the datasets from here. Once downloaded, extract them to the 'datasets' directory.

Usage

The 'experiments' folder contains a few example scripts that run the code, ranging from basic to more advanced. To learn more about available script parameters, see the contents of 'main.py'.

By default, any files created as part of running the scripts are saved under 'results'.

Example - 'basic'

Go to 'experiments' and run the basic script:

bash basic.sh

This script tests the Lasso model against one iteration of the IHDP data set. The estimator's performance on the relevant metrics should be printed in the console. In the same script, it is easy to change the number of iterations, the data set or the estimator.

Example - 'advanced'

Go to 'experiments' and run the advanced script:

bash advanced.sh

This script covers more estimators and 10 iterations of IHDP. Once it is done, you should see a summary of the results; its content can also be found in 'results/combined.csv'.
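
For a quick look at the summary outside the shell, 'results/combined.csv' can be loaded with pandas. The sketch below assumes only that the file is a regular CSV; it does not rely on any particular column names.

# Minimal sketch: inspect the combined summary with pandas.
# Assumes only that results/combined.csv is a plain CSV file.
import pandas as pd

summary = pd.read_csv('results/combined.csv')
print(summary.columns.tolist())  # which metrics/estimators the summary reports
print(summary.head())            # first few rows of the table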

Example - 'extensive'

This script tests almost all estimators against all four data sets. Depending on the computational power of your machine, this script may take days or even weeks to complete. To run the script, go to 'experiments' and run:

bash extensive.sh

Analysing results

When the scripts are run, a separate directory is created for each selected estimator to store its results. The following files are usually created:

  • info.log (intermediate results as the script is being executed)
  • scores.csv (final scores per relevant metric)
  • times.csv (training and prediction time, in seconds, consumed by an estimator)

In addition, it is possible to get a summary of multiple estimators in a single table. This can be done via the 'results/process.py' script, which produces the 'combined.csv' file. For example usage, see the existing run scripts.
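
The actual aggregation logic lives in 'results/process.py'; the snippet below is only a rough sketch of the same idea, based on the layout described above (one sub-directory per estimator, each containing a 'scores.csv'). The output file name is chosen here just to avoid clashing with the real 'combined.csv'.

# Rough sketch of the aggregation idea (not the actual results/process.py):
# collect scores.csv from each estimator's sub-directory and stack them into
# a single table, tagged with the estimator name taken from the directory.
from pathlib import Path

import pandas as pd

frames = []
for scores_path in Path('results').glob('*/scores.csv'):
    df = pd.read_csv(scores_path)
    df['estimator'] = scores_path.parent.name  # directory name identifies the estimator
    frames.append(df)

combined = pd.concat(frames, ignore_index=True)
combined.to_csv('results/combined_sketch.csv', index=False)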

Adding other estimators

The code can be easily extended to use more estimators.

  1. Go to 'main.py'.
  2. Edit 'get_parser()': add a new key to 'estimation_model'.
  3. Edit '_get_model()': using the new key, return an instance of your model.
  4. Edit 'estimate()': train your model on the data and provide predictions (a sketch of these steps follows below).
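
As a hedged illustration of steps 2 to 4, the sketch below wraps scikit-learn's Ridge regression as a simple S-learner. The function names 'get_parser()', '_get_model()' and 'estimate()' come from 'main.py', but their exact signatures, and the argument names used here, are assumptions made for the sake of the example rather than the repository's actual API.

# Hedged sketch of steps 2-4. Function names follow main.py, but the
# signatures and variable names below are assumptions, not the repo's API.
import numpy as np
from sklearn.linear_model import Ridge

# Step 2 (in get_parser()): register the new key, e.g. add 'ridge' to the
# choices accepted by the 'estimation_model' argument.

def _get_model(name):
    # Step 3: map the new key to an instance of the model.
    if name == 'ridge':
        return Ridge(alpha=1.0)
    raise ValueError(f'Unknown estimator: {name}')

def estimate(model, X, t, y):
    # Step 4: train on covariates plus treatment, then predict outcomes
    # under t=1 and t=0; their difference is the CATE estimate.
    model.fit(np.column_stack([X, t]), y)
    y1 = model.predict(np.column_stack([X, np.ones_like(t)]))
    y0 = model.predict(np.column_stack([X, np.zeros_like(t)]))
    return y1 - y0

The S-learner form is used here only because it is the simplest way to turn a plain regressor into a CATE estimator; any other learner can be plugged in the same way.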

Other projects

Projects using the CATE benchmark:
