The implementation of SiZeLet (SZL), proposed in the paper "Everything Can Be Embedded" by Zhili Shen.
- Create `./output`, `./cache`, and `./saved_models` under the project folder.
- Download the Product Fit dataset from this link, and unzip it in the project folder.
- Download the GloVe word embeddings from this link, unzip `glove.twitter.27B.zip`, and move the extracted files to the `./cache` folder.
- You can configure SiZeLet in `config.py`. The following options are available:
  - `train_test_proportion`: split proportion between the training set and the validation set.
  - `TextCNNorBiRNN`: whether to use TextCNN or BiRNN.
  - `max_length_sentence`: maximum sentence length.
  - `min_frequency`: words whose frequency is lower than this value are removed.
  - `use_pretrained_model`: whether to use a pre-trained model.
  - `user_embedding_dim`: embedding dimension for the users' discrete attributes.
  - `item_embedding_dim`: embedding dimension for the items' discrete attributes.
  - `review_embedding_dim`: embedding dimension for the reviews' discrete attributes.
  - `kernel_sizes`: TextCNN convolution kernel sizes.
  - `num_channels`: number of TextCNN channels.
  - `num_hidden`: BiRNN hidden layer dimension.
  - `num_layers`: number of BiRNN hidden layers.
  - `lr`: learning rate.
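
As a rough illustration of how the options above fit together, here is a minimal sketch of a configuration object. The option names come from the list above; everything else is an assumption for illustration: the class name `SiZeLetConfig`, the types, the string values `"TextCNN"`/`"BiRNN"`, and all default values are placeholders, not the values shipped in `config.py`.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class SiZeLetConfig:
    """Hypothetical container for the options listed above.

    All defaults are placeholders chosen for illustration only.
    """
    train_test_proportion: float = 0.8   # assumed: fraction of data used for training
    TextCNNorBiRNN: str = "TextCNN"      # assumed values: "TextCNN" or "BiRNN"
    max_length_sentence: int = 100
    min_frequency: int = 5
    use_pretrained_model: bool = True
    user_embedding_dim: int = 16
    item_embedding_dim: int = 16
    review_embedding_dim: int = 100
    kernel_sizes: Tuple[int, ...] = (3, 4, 5)
    num_channels: Tuple[int, ...] = (100, 100, 100)  # assumed: one entry per kernel size
    num_hidden: int = 100
    num_layers: int = 2
    lr: float = 1e-3

    def validate(self) -> None:
        """Raise ValueError if an option is outside a sensible range."""
        if not 0.0 < self.train_test_proportion < 1.0:
            raise ValueError("train_test_proportion must be in (0, 1)")
        if self.TextCNNorBiRNN not in ("TextCNN", "BiRNN"):
            raise ValueError("TextCNNorBiRNN must be 'TextCNN' or 'BiRNN'")
        if len(self.kernel_sizes) != len(self.num_channels):
            raise ValueError("kernel_sizes and num_channels must have equal length")
        if self.min_frequency < 0 or self.max_length_sentence <= 0:
            raise ValueError("min_frequency and max_length_sentence must be positive")
        if self.lr <= 0:
            raise ValueError("lr must be positive")


# Example: switch to the BiRNN encoder with a smaller learning rate.
config = SiZeLetConfig(TextCNNorBiRNN="BiRNN", lr=5e-4)
config.validate()
```

Checking the values up front catches a mistyped option before a long training run starts; the actual `config.py` may simply define these names as module-level variables.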
- Run `train.py`: `python train.py`
- The results will appear in `./output/output.txt`:

| F1-score | Accuracy | AUC |
|---|---|---|
| 0.713 | 0.831 | 0.886 |
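
For readers unfamiliar with the reported metrics, the following is a minimal pure-Python sketch of accuracy and F1-score. It assumes macro-averaged F1 over the class labels; the README does not state which averaging `train.py` uses, and the function names and the `small`/`fit`/`large` example labels are illustrative only. AUC is omitted because it requires predicted probabilities rather than hard labels.

```python
from typing import Hashable, Sequence


def accuracy(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Fraction of predictions that match the true label."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def macro_f1(y_true: Sequence[Hashable], y_pred: Sequence[Hashable]) -> float:
    """Unweighted mean of the per-class F1 scores (macro averaging, assumed)."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    labels = set(y_true) | set(y_pred)
    scores = []
    for label in labels:
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
        fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only; not taken from the dataset files.
y_true = ["fit", "small", "large", "fit"]
y_pred = ["fit", "small", "fit", "fit"]
print(f"accuracy={accuracy(y_true, y_pred):.3f}, macro-F1={macro_f1(y_true, y_pred):.3f}")
```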