❄️ ColdRec 🔥

☃️ ColdRec is a comprehensive open-source toolkit and benchmark for cold-start recommendation. In ColdRec, models follow a unified pipeline, datasets follow a unified split, and the tasks cover cold user/item recommendation, warm user/item recommendation, and overall user/item recommendation, aiming to provide the community with a comprehensive and fair benchmark evaluation for cold-start recommendation.

🔧 Information in 2024.11: Thanks for using ColdRec! There are still many aspects of this codebase that can be improved, such as the functions and unclear descriptions mentioned in the issues. Due to a recent busy schedule, I am sorry that I cannot reply carefully to every message one by one. I plan to improve this codebase based on the feedback in the issues in February or March 2025.

🥳 Update in 2024.06: Added automatic hyper-parameter tuning; install one additional base library, optuna, to enable this module.


🛫 Requirements

ColdRec aims to avoid a complicated and tedious installation process, so the codebase is built with native PyTorch and a small number of necessary libraries.

python >= 3.8.0 
torch >= 1.11.0
faiss-gpu >= 1.7.3
pandas >= 2.0.3
numba >= 0.58.1 
numpy >= 1.24.4
scikit-learn >= 1.3.2
pickle >= 0.7.5
optuna >= 3.6.1 (If you need automatic hyper-parameter tuning) 
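
Apart from pickle, which ships with the Python standard library, these dependencies can usually be installed with pip. The command below is only a rough sketch, assuming a CUDA-capable environment for faiss-gpu (a conda build of faiss may be preferable on some systems):

pip install "torch>=1.11.0" "faiss-gpu>=1.7.3" "pandas>=2.0.3" "numba>=0.58.1" "numpy>=1.24.4" "scikit-learn>=1.3.2" "optuna>=3.6.1"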

🚀 Quick Start

1️⃣ Dataset Preprocessing

We provide preprocessed datasets on Google Drive. You can download a dataset directly from the Google Drive link, unzip it, and place it into the ./data folder. Then, you can process it into the format required for model training with two simple scripts; the details can be found on the dataset details page.

2️⃣ Warm Embedding Pre-Training

This step is not strictly necessary, but since most methods require it, completing it is recommended so that all models can be fully evaluated. We provide both the widely adopted collaborative filtering model MF and the graph-based model LightGCN as warm recommenders. You can obtain the pre-trained warm user/item embeddings in either of the following two ways:

Option 1: You can directly download the BPR-MF pre-trained embeddings from Google Drive. The embedding folder associated with each dataset should then be placed in the ./emb folder.

Option 2: You can also pre-train the warm embeddings yourself. Specifically, you can obtain them by running the following script:

python main.py --dataset [DATASET NAME] --model [WARM RECOMMENDER] --cold_object [user/item]

In the above script, [DATASET NAME] for --dataset should be replaced by your target dataset name, such as movielens. [WARM RECOMMENDER] for --model should be one of the supported warm recommenders (MF, LightGCN, NCL, SimGCL, or XSimGCL). Finally, [user/item] for --cold_object should be set to user or item, depending on whether you want the user cold-start or the item cold-start setting.
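
For example, pre-training MF warm embeddings on the movielens dataset under the item cold-start setting (using the example names mentioned above) would look like:

python main.py --dataset movielens --model MF --cold_object item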

3️⃣ Cold-Start Model Training and Evaluation

At this point, you can train a cold-start model with a single script:

python main.py --dataset [DATASET NAME] --model [MODEL NAME] --cold_object [user/item]

In the above script, [MODEL NAME] for --model is the name of the desired model; we provide 20 representative models, listed under Supported Models below. You can also flexibly register your own model with the ColdRec framework for evaluation.
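
For example, training DropoutNet (one of the supported models listed below) on movielens under the item cold-start setting would be:

python main.py --dataset movielens --model DropoutNet --cold_object item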

4️⃣ (Optional) Automatic Hyper-parameter Tuning

ColdRec also supports automatic hyper-parameter tuning. You can tune hyper-parameters with optuna using a single script:

python param_search.py --dataset [DATASET NAME] --model [MODEL NAME] --cold_object [user/item]

You can flexibly set the tuning range in param_search.py.
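
For example, an automatic search for DropoutNet on movielens under the item cold-start setting would be launched with:

python param_search.py --dataset movielens --model DropoutNet --cold_object item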


🧸 Supported Models

| ID | Paper | Model | Venue |
| --- | --- | --- | --- |
| 1 | BPR: Bayesian Personalized Ranking from Implicit Feedback | BPR-MF | UAI 2009 |
| 2 | Deep Content-based Music Recommendation | DeepMusic | NeurIPS 2013 |
| 3 | Social Collaborative Filtering for Cold-start Recommendations | KNN | RecSys 2014 |
| 4 | Learning Image and User Features for Recommendation in Social Networks | DUIF | ICCV 2015 |
| 5 | VBPR: Visual Bayesian Personalized Ranking from Implicit Feedback | VBPR | AAAI 2016 |
| 6 | DropoutNet: Addressing Cold Start in Recommender Systems | DropoutNet | NeurIPS 2017 |
| 7 | Adversarial Training Towards Robust Multimedia Recommender System | AMR | TKDE 2019 |
| 8 | Warm Up Cold-start Advertisements: Improving CTR Predictions via Learning to Learn ID Embeddings | MetaEmbedding | SIGIR 2019 |
| 9 | LightGCN: Simplifying and Powering Graph Convolution Network for Recommendation | LightGCN | SIGIR 2020 |
| 10 | How to Learn Item Representation for Cold-Start Multimedia Recommendation? | MTPR | MM 2020 |
| 11 | LARA: Attribute-to-feature Adversarial Learning for New-item Recommendation | LARA | WSDM 2020 |
| 12 | Recommendation for New Users and New Items via Randomized Training and Mixture-of-Experts Transformation | Heater | SIGIR 2020 |
| 13 | Contrastive Learning for Cold-Start Recommendation | CLCRec | MM 2021 |
| 14 | Improving Graph Collaborative Filtering with Neighborhood-enriched Contrastive Learning | NCL | WWW 2022 |
| 15 | Are Graph Augmentations Necessary? Simple Graph Contrastive Learning for Recommendation | SimGCL | SIGIR 2022 |
| 16 | Generative Adversarial Framework for Cold-Start Item Recommendation | GAR | SIGIR 2022 |
| 17 | XSimGCL: Towards Extremely Simple Graph Contrastive Learning for Recommendation | XSimGCL | TKDE 2023 |
| 18 | GoRec: A Generative Cold-start Recommendation Framework | GoRec | MM 2023 |
| 19 | Contrastive Collaborative Filtering for Cold-Start Item Recommendation | CCFRec | WWW 2023 |
| 20 | Aligning Distillation For Cold-start Item Recommendation | ALDI | SIGIR 2023 |
