This repository contains the code and resources associated with the ISMIR 2019 paper:
Siddharth Gururani, Mohit Sharma, Alexander Lerch. An Attention Mechanism for Musical Instrument Recognition. (To appear) In Proceedings of the International Society for Music Information Retrieval Conference, ISMIR 2019.
Before executing any code, please download the data from here. You should then place train.npz and test.npz in the data folder.
Alternatively, you may download the OpenMIC dataset and use the tool data/data_split.py to generate the dataset splits.
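Before training, it can help to confirm that both files are where the code expects them. The following is a minimal sketch, not part of the repository: it assumes you run it from the repository root and that the folder is named data as described above. The helper name missing_data_files is hypothetical.

```python
from pathlib import Path

# Files this README says should be placed in the data folder.
EXPECTED_FILES = ("train.npz", "test.npz")


def missing_data_files(data_dir="data"):
    """Return the expected files that are not present in data_dir.

    Hypothetical helper for illustration; the folder name "data" is taken
    from the README, and the working directory is assumed to be the repo root.
    """
    root = Path(data_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_data_files()
    if missing:
        print("Missing from data folder:", ", ".join(missing))
    else:
        print("Found train.npz and test.npz.")
```

If either file is reported missing, download the data again or regenerate the splits with data/data_split.py.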
You need to have PyTorch, tensorboardX, tqdm, and deepcopy (the last ships with Python's standard copy module, so it needs no separate install) in your Python environment. We will update the repo with a conda environment file for easy setup.
By default the code assumes the presence of a GPU. We will add a device-agnostic version of the code in future commits.
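Until a device-agnostic version lands, a common PyTorch pattern is to choose the device at runtime and fall back to the CPU. The sketch below is an assumption about how that could look, not the repository's code; the helper name pick_device is hypothetical, and it returns a plain string so that it also runs when PyTorch is not installed.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu".

    Hypothetical helper for illustration; it is not part of this repository.
    """
    try:
        import torch
    except ImportError:
        # Without PyTorch there is nothing to place on a GPU.
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


if __name__ == "__main__":
    # With PyTorch installed, the string can be passed on, for example:
    #   device = torch.device(pick_device())
    #   model = model.to(device)
    print("Selected device:", pick_device())
```

Adapting the existing scripts this way would presumably mean replacing any hard-coded GPU placement with a device chosen like this.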
The commands in the multirun_commands.txt file were used to train the various models with different random seeds. If you are only interested in the attention model, that can be found in Attention.py. The baseline models are implemented in model.py.
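For readers who want to reproduce runs with different random seeds in their own scripts, the sketch below shows one common way to fix a seed. It is an illustration only: this README does not say how the training code sets its seeds, and the helper name set_seed is hypothetical.

```python
import random


def set_seed(seed):
    """Seed Python's RNG, plus NumPy and PyTorch when they are installed.

    Hypothetical helper for illustration; it is not part of this repository.
    """
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass
    try:
        import torch
        torch.manual_seed(seed)
        # Safe to call without a GPU; it is ignored when CUDA is unavailable.
        torch.cuda.manual_seed_all(seed)
    except ImportError:
        pass


if __name__ == "__main__":
    set_seed(0)
    first = random.random()
    set_seed(0)
    second = random.random()
    print("Same draw after reseeding:", first == second)
```

Fixing the seed makes Python-level randomness repeatable, although some GPU operations can remain nondeterministic even with identical seeds.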
Our thanks to Qiuqiang Kong for their implementation of the attention model from their paper:
Qiuqiang Kong, Yong Xu, Wenwu Wang and Mark D. Plumbley. Audio Set classification with attention model: A probabilistic perspective. In: International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2018, Calgary, Canada, 15-20 April 2018.