This is an ULMFiT-based NLP approach to detecting fake online reviews, serving as an example of how to implement ULMFiT in TensorFlow and how to use the pretrained model by Hubert Karbowy and the edrone team.
The implementation and documentation were done by: Tim Kapferer @TimKapf, Sofia Worsfold @fiabox and Tim Petersen @Antim8
To get a better understanding of the hows and whys, take a look at our paper-like document.
- Have a Python environment with TensorFlow installed.
- Install the dependencies listed in requirements.txt with pip.
- Make sure your OS and Python version are supported by tensorflow_text, otherwise errors will occur (see the quick check after this list).
- Clone the repository by edrone into a folder named "tf2_ulmit", otherwise the imports won't work.
- Install their requirements as well.
- Download our trained models from here and put them into the SavedModels folder.
- If you want to inspect the generated logs graphically, install TensorBoard and start it with the corresponding logdir.
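As a quick check that the environment is set up correctly (a minimal sketch that only assumes the packages named above are installed), you can try the imports before running anything else:

```python
# Minimal environment check: these imports fail early if TensorFlow or
# tensorflow_text is missing or built for an unsupported OS/Python combination.
import tensorflow as tf
import tensorflow_text  # noqa: F401  (imported only to verify that it loads)

print("TensorFlow version:", tf.__version__)
```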
File/Folder | Usage |
---|---|
logs | Logs of the training runs are stored here |
amazon.model | Language model trained on Amazon review data |
new_amazon.model | Improved Amazon language model |
main.py | The main file for the user to interact with |
utils.py | Helper functions |
model_util.py | Functions that help to train ULMFiT |
fake_review_dataset.csv | Our main dataset for training the classifier |
model.py | Our model wrapping the ULMFiT implementation by edrone |
rev_(clean)_data | Fetched and cleaned Amazon review data from the official TensorFlow dataset |
shortenSPM.model | Shortened version of the original LM's SentencePiece model (from 35,000 to around 4-5k sentencepieces) |
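If you want to inspect one of the SentencePiece models listed above, a small sketch like the following will load it and report its vocabulary size (the file name shortenSPM.model is taken from the table; everything else is plain sentencepiece usage):

```python
import sentencepiece as spm

# Load a SentencePiece model and check how many pieces it contains.
sp = spm.SentencePieceProcessor(model_file="shortenSPM.model")
print("vocabulary size:", sp.get_piece_size())
print(sp.encode("This product is amazing!", out_type=str))
```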
Check out the folder [Scientific background].
The normal interaction is to simply run the main.py script and follow the instructions. If, however, you want to go through the whole process of data gathering and training again, follow the steps below:
Run the function train_sentencepiece_model in the util.py file to get a SentencePiece model of our Amazon dataset.
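For reference, training a SentencePiece model on the review text generally looks like the sketch below, which uses the sentencepiece library directly; the input file name, vocabulary size, and model type are assumptions, and the project's train_sentencepiece_model may use different settings:

```python
import sentencepiece as spm

# Train a SentencePiece model on the cleaned Amazon review text.
# "rev_clean_data.txt" and vocab_size=4000 are placeholder values.
spm.SentencePieceTrainer.train(
    input="rev_clean_data.txt",
    model_prefix="amazon",   # writes amazon.model and amazon.vocab
    vocab_size=4000,
    model_type="unigram",
)
```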
Run the function blablabla with the provided amazon.model.
Run the function (Code Discord) to obtain new_amazon.model.
Run prepare_for_generation in util.py (with rev_data.txt and new_amazon.model).
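A call to this step could look roughly like the following; this is only a sketch, since the exact signature of prepare_for_generation is not documented here and the argument order is an assumption based on the files named above:

```python
from util import prepare_for_generation

# Assumed call: the cleaned review text and the new SentencePiece model.
# Check util.py for the actual parameter names and order.
prepare_for_generation("rev_data.txt", "new_amazon.model")
```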
Run model.py and set classifier to false to fine-tune the model on the created dataset. The model will be saved in the SavedModels folder as fine_tuned_model.
Run model.py and set classifier to true. Use at least 11 epochs so that gradual unfreezing works as intended. The classifier will be saved in the SavedModels folder as classifier_model.
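Gradual unfreezing starts with a frozen backbone and makes one additional layer group trainable at each epoch, which is why the run needs enough epochs to reach every group. The callback below is only an illustrative sketch of this idea in generic Keras terms, not the project's actual implementation:

```python
import tensorflow as tf

class GradualUnfreeze(tf.keras.callbacks.Callback):
    """Illustrative sketch: unfreeze one more layer group at the start of each epoch."""

    def __init__(self, layer_groups):
        super().__init__()
        # layer_groups: list of lists of layers, ordered from the classifier head
        # down towards the embeddings.
        self.layer_groups = layer_groups

    def on_epoch_begin(self, epoch, logs=None):
        if epoch < len(self.layer_groups):
            for layer in self.layer_groups[epoch]:
                layer.trainable = True
            # Depending on the Keras version, the model may have to be recompiled
            # for the changed trainable flags to take effect.
```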
Finally, run main.py and follow the instructions, just like in the beginning.
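If you would rather load the trained classifier directly instead of going through main.py, a minimal sketch could look like this (assuming the classifier was exported in TensorFlow's SavedModel format at the path mentioned above):

```python
import tensorflow as tf

# Load the trained classifier from the SavedModels folder.
classifier = tf.keras.models.load_model("SavedModels/classifier_model")
classifier.summary()
```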