A workflow that uses Alteryx, Python and Tableau to extract and analyse n-grams from a large corpus of raw email text.
- 'ngram.py' - Python script that generates the n-grams.
- 'ngrams.csv' - CSV file containing the generated n-grams.
- 'Parse ngrams.yxmd' - Alteryx workflow to parse and prepare generated n-grams, including dictionary check.
- Elegant n-gram generation in Python - A simple n-gram algorithm using zip()
- English-words - A text file containing 335k English words
- Counter class in Python - Used to calculate n-gram counts
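
The two Python techniques referenced above (zip()-based n-gram generation and counting with Counter) can be sketched roughly as follows. This is a minimal illustration and not the contents of 'ngram.py': the function name `ngrams`, the whitespace tokenisation and the sample sentence are all assumptions made for the example.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list as a list of tuples (hypothetical helper)."""
    # Build n staggered views of the list (offsets 0..n-1) and zip them,
    # so each output tuple holds n consecutive tokens.
    return list(zip(*(tokens[i:] for i in range(n))))


# Made-up sample text; real email text would need proper cleaning first.
text = "the quick brown fox jumps over the quick brown dog"
tokens = text.lower().split()

bigrams = ngrams(tokens, 2)
counts = Counter(bigrams)

# Show the most frequent bigrams with their counts.
for gram, count in counts.most_common(3):
    print(" ".join(gram), count)
```

Because zip() stops at the shortest input, a token list shorter than n simply yields no n-grams, so no explicit length check is needed.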