This repository contains all models and adaptations needed to run Multi-GNN for Anti-Money Laundering. It provides four Graph Neural Network model classes (GIN, GAT, PNA, RGCN) together with the model adaptations described below, as used for financial crime detection in Egressy et al. Note that this repository focuses solely on the Anti-Money Laundering use case. The code was created for the experiments in *Provably Powerful Graph Neural Networks for Directed Multigraphs* [AAAI 2024] and *Realistic Synthetic Financial Transactions for Anti-Money Laundering Models* [NeurIPS 2023].
To use the repository, you first need to install the conda environment via
conda env create -f env.yml
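Once the environment has been created, activate it before running any of the scripts below. The environment name is defined in `env.yml`; `multignn` in this sketch is only a placeholder:

```bash
# Activate the environment created from env.yml (replace the name if yours differs)
conda activate multignn
```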
Then, the data needed for the experiments can be found on Kaggle. To use this data with the provided training scripts, you first need to run a pre-processing step on the downloaded transaction files (e.g. `HI-Small_Trans.csv`):
python format_kaggle_files.py /path/to/kaggle-files/HI-Small_Trans.csv
Make sure to change the file paths in the `data_config.json` file. The `aml_data` path should be changed to wherever you stored the `formatted_transactions.csv` file generated by the pre-processing step.
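For reference, here is a minimal sketch of what `data_config.json` could look like, assuming a flat layout with the `aml_data`, `model_to_save`, and `model_to_load` keys referenced in this README (the exact structure in your copy of the file may differ):

```json
{
    "aml_data": "/path/to/formatted-data/",
    "model_to_save": "/path/to/saved-models/",
    "model_to_load": "/path/to/saved-models/"
}
```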
To run the experiments, you need to run the `main.py` script and specify any arguments you want to use. There are two required arguments, namely `--data` and `--model`. For the `--data` argument, make sure you store the different datasets in separate folders and then specify the folder name, e.g. `--data Small_HI`. The `--model` argument should be set to one of the available model classes, i.e. one of `gin`, `gat`, `rgcn`, or `pna`. Thus, to run a standard GNN, run, e.g.:
python main.py --data Small_HI --model gin
Then you can add different adaptations to the models by selecting the respective arguments from:

Argument | Description |
---|---|
`--emlps` | Edge updates via MLPs |
`--reverse_mp` | Reverse message passing |
`--ego` | Ego IDs added to the center nodes |
`--ports` | Port numberings for edges |

For example, to run a GIN model with all adaptations enabled:
python main.py --data Small_HI --model gin --emlps --reverse_mp --ego --ports
There are several additional arguments that can be set for extra functionality:

Argument | Description |
---|---|
`--tqdm` | Displays a progress bar during training and inference. |
`--save_model` | Saves the best model to the `model_to_save` path specified in the `data_config.json` file. Requires the `--unique_name` argument to be set. |
`--finetune` | Loads a previously trained model (with the name given by `--unique_name`, stored at the `model_to_load` path in `data_config.json`) and fine-tunes it. |
`--inference` | Loads a previously trained model (with the name given by `--unique_name`, stored at the `model_to_load` path in `data_config.json`) and runs inference only. |
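For example, a sketch combining the flags above (`gin_small_hi` is just an illustrative value for `--unique_name`):

```bash
# Train a GIN model and save the best checkpoint under a unique name
python main.py --data Small_HI --model gin --save_model --unique_name gin_small_hi

# Later, reload that checkpoint for inference only
python main.py --data Small_HI --model gin --inference --unique_name gin_small_hi
```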
Apache License Version 2.0, January 2004