An awesome script for matching words of one document to other documents in order to detect plagiarism!
Find similarities between a .txt file and several other .txt files to detect plagiarism.
- HTML5
- Bootstrap 5
- JavaScript
- Multiple pattern matching with a Trie (future releases will use a better algorithm, probably Aho-Corasick)
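The Trie-based matching mentioned above could look roughly like the sketch below. It is not taken from the project's source: the names tokenize, Trie, buildTrie and findSimilarities, the word-level windows, and the lowercase ASCII tokenization are all assumptions made for illustration. Every run of windowSize consecutive words from each reference text is stored in a trie, and the file to check is then scanned window by window.

```javascript
// Hypothetical sketch: names and tokenization rule are illustrative assumptions,
// not the project's actual implementation.

// Split text into lowercase ASCII words (assumed tokenization).
function tokenize(text) {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter(Boolean);
}

class Trie {
  constructor() {
    this.root = { children: new Map(), sources: new Set() };
  }

  // Insert a sequence of words and remember which reference it came from.
  insert(words, source) {
    let node = this.root;
    for (const word of words) {
      if (!node.children.has(word)) {
        node.children.set(word, { children: new Map(), sources: new Set() });
      }
      node = node.children.get(word);
    }
    node.sources.add(source);
  }

  // Return the sources whose stored pattern equals `words`, or null if none.
  lookup(words) {
    let node = this.root;
    for (const word of words) {
      node = node.children.get(word);
      if (!node) return null;
    }
    return node.sources.size > 0 ? node.sources : null;
  }
}

// Store every window of `windowSize` words from each reference text.
// `references` maps a file name to its text content.
function buildTrie(references, windowSize) {
  const trie = new Trie();
  for (const [name, text] of Object.entries(references)) {
    const words = tokenize(text);
    for (let i = 0; i + windowSize <= words.length; i++) {
      trie.insert(words.slice(i, i + windowSize), name);
    }
  }
  return trie;
}

// Slide the same window over the text to check and collect every hit.
function findSimilarities(trie, text, windowSize) {
  const words = tokenize(text);
  const matches = [];
  for (let i = 0; i + windowSize <= words.length; i++) {
    const phraseWords = words.slice(i, i + windowSize);
    const sources = trie.lookup(phraseWords);
    if (sources) {
      matches.push({ index: i, phrase: phraseWords.join(' '), sources: [...sources] });
    }
  }
  return matches;
}

// Usage with made-up reference contents.
const demoTrie = buildTrie(
  { 'a.txt': 'The quick brown fox jumps', 'b.txt': 'over the lazy dog' },
  3
);
const demoMatches = findSimilarities(demoTrie, 'A quick brown fox appeared', 3);
console.log(demoMatches);
```

With fixed-size windows every lookup restarts at the root, which is the kind of repeated work an Aho-Corasick automaton is designed to avoid.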
- Clone the repo
  git clone https://github.com/ivaste/pattern_matching.git
- Convert your .pdf files to .txt with https://pdftotext.com/ (in future releases this step will be automated)
- Open the index.html file with your browser
- Drag and drop your reference .txt files into the Reference File Box
- Drag and drop the .txt file to check into the File to Check Box
- Click the Find Similarities button
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are greatly appreciated.
- Fork the Project
- Create your Feature Branch (git checkout -b feature/AmazingFeature)
- Commit your Changes (git commit -m 'Add some AmazingFeature')
- Push to the Branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Stefano Ivancich - stefano-ivancich
Project Link: https://github.com/ivaste/pattern_matching
- Multiple pattern matching algorithm with Trie
- Check if the user uploaded the right file type (.txt)
- Automatically convert PDF (or other formats) to .txt
- Disable the button after it is clicked
- Show a progress bar while calculating
- Remove non-ASCII characters
- Better pattern matching algorithm (Aho-Corasick)
- Let the user choose window_size
- Multiple matching with different window sizes, then combine the results
- Redo everything with React
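A few of the roadmap items above are small enough to sketch as plain functions. The helpers below are hypothetical: isTxtFile, removeNonAscii, wordWindows and windowsForSizes are illustrative names, not functions from the repository, and the way results for several window sizes would be combined is left open.

```javascript
// Hypothetical helpers for some roadmap items; all names are illustrative assumptions.

// Accept only file names ending in ".txt" (case-insensitive).
function isTxtFile(fileName) {
  return /\.txt$/i.test(fileName);
}

// Drop every character outside the 7-bit ASCII range.
function removeNonAscii(text) {
  return text.replace(/[^\x00-\x7F]/g, '');
}

// Split a word list into overlapping windows of `windowSize` words.
function wordWindows(words, windowSize) {
  const windows = [];
  for (let i = 0; i + windowSize <= words.length; i++) {
    windows.push(words.slice(i, i + windowSize).join(' '));
  }
  return windows;
}

// Build the windows for several sizes, keyed by size,
// so that per-size match results can be combined afterwards.
function windowsForSizes(words, sizes) {
  const bySize = new Map();
  for (const size of sizes) {
    bySize.set(size, wordWindows(words, size));
  }
  return bySize;
}

// Usage with made-up input.
console.log(isTxtFile('thesis.TXT'));
console.log(removeNonAscii('caf\u00e9 ok'));
console.log(wordWindows(['a', 'b', 'c', 'd'], 2));
```

In the browser, isTxtFile could be applied to the name property of each dropped File object before its contents are read.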